METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER 

CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN 
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN 
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and 
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated 
herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer;
and to the use of such expression profiles and compositions in diagnosis and therapy of lung
cancer. The invention further relates to methods for identifying and using agents and/or
targets that inhibit lung cancer or related conditions.

BACKGROUND OF THE INVENTION 

Lung cancer is the second most commonly occurring cancer in the United States and
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new
cases of lung cancer in the United States every year. Of those who are diagnosed with lung
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women.

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers
resulting from smoking. About 400 to 500 separate gaseous substances are present in the
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene,
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines.
Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by nonsmokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher concentrations in
secondhand smoke. For this and other reasons, "passive smoking" is an
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers
each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium, exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic,
genetic factors, and diet.

Histological classification of the various lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers, occurs
less frequently than non-small cell lung cancer, and generally spreads to distant organs more
rapidly than non-small cell lung cancer. In general, at the time of presentation small cell lung
cancers have already spread beyond the bounds where surgery with curative intent
can be undertaken. However, if identified early enough, these cancers are often responsive to
chemotherapy and thoracic radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
they usually are not identified until significant metastasis has occurred, at which point they
are typically not very responsive to surgery, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in lung disease and other metastatic cancers.

SUMMARY OF THE INVENTION
The present invention provides nucleotide sequences of genes that are up- and
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by
the following description of the invention.

In one aspect, the present invention provides a method of detecting a lung
cancer-associated transcript in a cell from a patient, the method comprising contacting a biological
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the sample may
be contacted with a specific binding reagent, e.g., an antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample or a body fluid. In
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment,
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate,
e.g., a human.

In one embodiment, the method further comprises the step of amplifying nucleic acids
before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample
with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated polypeptide in the biological sample by contacting the
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope, or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant. In one
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein. In one embodiment, the compound is an
antibody.

In another aspect, the present invention provides a drug screening assay comprising
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom; and (ii) comparing the level of gene expression of a polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of
the polynucleotide is a candidate for the treatment of lung cancer.

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that
has not been treated with the test compound. In another embodiment, the control is a normal
cell or mammal, or a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal
having lung cancer comprising administering a compound identified by the assay described
herein.






WO 02/086443 PCT/US02/12476

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by
the assay described herein and a physiologically acceptable excipient.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and treatment of lung disease or cancer, as well as methods for
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection,
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma,
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See,
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P,
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may
be of the lung cancer or related condition itself, or treatment of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and
small molecule agonists/antagonists, which will be useful to selectively identify those
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for
molecular characterization of subsets of lung diseases, which subsets may actually require
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancers, e.g., those which affect similar tissues in non-malignant diseases, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging,
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or
decreases in expression levels.

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
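The percentile ratio described in the preceding paragraph can be sketched as follows. This is a minimal illustration only, not the procedure used to build the tables: the function names `percentile` and `percentile_ratio`, the linear-interpolation rule for percentiles, and the sample values are assumptions, since the text states only which percentiles are divided.

```python
def percentile(values, pct):
    """Return the pct-th percentile of values using linear interpolation.

    The interpolation rule is an assumption; the text does not say how
    percentiles were computed.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_ratio(disease_levels, normal_levels,
                     disease_pct=70.0, normal_pct=90.0):
    """Divide a percentile of diseased-sample expression by a percentile of
    normal-sample expression (70th over 90th, per the text)."""
    denominator = percentile(normal_levels, normal_pct)
    if denominator <= 0:
        raise ValueError("normal-sample percentile must be positive")
    return percentile(disease_levels, disease_pct) / denominator


# Hypothetical expression levels for one gene (arbitrary units).
tumor = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
normal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
ratio = percentile_ratio(tumor, normal)
```

Taking a high percentile of the normal samples as the denominator makes the ratio conservative: a gene scores well only if most diseased samples exceed nearly all normal samples.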


Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)
have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or
polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translational processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird, reptile, or
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful.

The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., the BLAST or BLAST 2.0 sequence comparison algorithms with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.
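As a rough illustration of the percent identity definition above, the following sketch computes identity over two sequences that have already been aligned. It assumes, for illustration only, that gaps are written as "-", that a column with a gap in one sequence counts as a mismatch, and that the function names and default thresholds are hypothetical; BLAST and other programs may count aligned columns differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption: columns where both sequences have a gap are ignored, and
    columns with a gap in exactly one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0, min_region=25):
    """True if the aligned region is at least min_region columns long and the
    percent identity is at least threshold (both values are parameters here)."""
    if len(aligned_a) < min_region:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, under these assumptions `percent_identity("ACG-T", "ACGAT")` evaluates four matching columns out of five compared.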

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
contiguous positions selected from the group consisting typically of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).
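For orientation, a local alignment score in the style of the Smith and Waterman algorithm cited above can be sketched with a short dynamic program. The scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name are illustrative assumptions and are not parameters taken from this document or from the Wisconsin package.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between seq_a and seq_b.

    Uses a linear gap penalty (an assumption made for brevity) and keeps
    only two rows of the dynamic-programming matrix.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current[j] = max(
                0,                      # start a new local alignment
                previous[j - 1] + pair,  # extend diagonally
                previous[j] + gap,       # gap in seq_b
                current[j - 1] + gap,    # gap in seq_a
            )
            best = max(best, current[j])
        previous = current
    return best
```

The floor of zero in each cell is what makes the alignment local: a poorly matching prefix is discarded rather than carried forward as a negative score.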

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
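The extension step described above can be illustrated with a simplified, ungapped sketch. It is not the BLAST implementation: the function name `extend_hit`, the default drop-off value X of 20, and the example sequences are assumptions, while M=5 and N=-4 follow the nucleotide defaults given in the text.

```python
def extend_hit(query, subject, q_pos, s_pos, word_len,
               match=5, mismatch=-4, x_drop=20):
    """Extend a word hit in both directions without gaps.

    Returns (score, q_start, q_end), with q_end exclusive, for the best
    segment found. Extension in a direction stops when the running score
    falls x_drop below its maximum, reaches zero or below, or a sequence ends.
    """
    def pair_score(a, b):
        return match if a == b else mismatch

    # Score of the seed word itself.
    score = sum(pair_score(query[q_pos + k], subject[s_pos + k])
                for k in range(word_len))
    best = score

    # Extend to the right of the seed.
    right = 0
    q, s = q_pos + word_len, s_pos + word_len
    while q < len(query) and s < len(subject):
        score += pair_score(query[q], subject[s])
        q += 1
        s += 1
        if score > best:
            best, right = score, q - (q_pos + word_len)
        elif score <= 0 or best - score >= x_drop:
            break

    # Extend to the left, starting from the best score reached so far.
    score = best
    left = 0
    q, s = q_pos - 1, s_pos - 1
    while q >= 0 and s >= 0:
        score += pair_score(query[q], subject[s])
        if score > best:
            best, left = score, q_pos - q
        elif score <= 0 or best - score >= x_drop:
            break
        q -= 1
        s -= 1

    return best, q_pos - left, q_pos + word_len + right


# Hypothetical sequences sharing the segment "ACGTACGT"; the seed is "GTAC".
query = "GGACGTACGTTT"
subject = "CCACGTACGTAA"
hsp = extend_hit(query, subject, q_pos=4, s_pos=4, word_len=4)
```

Because the extension is ungapped, the matching subject coordinates are the query coordinates shifted by `s_pos - q_pos`. A larger X lets the extension ride through longer runs of mismatches at the cost of more computation.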

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
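The probability thresholds in the preceding paragraph can be expressed as a small classification helper. The tier labels and function names below are hypothetical and serve only to make the three cut-offs (0.2, 0.01, and 0.001) concrete; the negative base-10 logarithm is included because the text refers to log values.

```python
import math


def similarity_tier(smallest_sum_probability):
    """Map a smallest sum probability P(N) to an illustrative tier label."""
    p = smallest_sum_probability
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if p < 0.001:
        return "most preferred"
    if p < 0.01:
        return "more preferred"
    if p < 0.2:
        return "similar"
    return "not similar"


def neg_log10(probability):
    """Return -log10(P); e.g., a probability of 1e-20 gives about 20."""
    if probability <= 0.0:
        raise ValueError("probability must be positive")
    return -math.log10(probability)
```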

An indication that two nucleic acid sequences are substantially identical is that the
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize
to each other under stringent conditions. Yet another indication that two nucleic acid
sequences are substantially identical is that the same primers can be used to amplify the
sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments
means removing at least one contaminant or component from the composition to be purified.

WO 02/086443 PCT/US02/12476

In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to another amino acid.

Amino acids may be referred to herein by either their commonly known three letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic acid
sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression
product, but not necessarily with respect to actual probe sequences.
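The idea of silent variations can be illustrated with a short sketch that enumerates every coding sequence for a peptide. It assumes a deliberately partial codon table containing only the codons named above, written in the RNA alphabet (so the tryptophan codon given as TGG appears as UGG); the table and the function name are hypothetical illustrations, not a complete genetic code.

```python
from itertools import product

# Partial codon table, restricted to the examples named in the text.
# Any amino acid not listed here raises KeyError in this sketch.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],  # alanine: four synonymous codons
    "M": ["AUG"],                        # methionine: ordinarily a single codon
    "W": ["UGG"],                        # tryptophan: ordinarily a single codon
}


def silent_variants(peptide: str) -> list[str]:
    """Enumerate every RNA sequence encoding `peptide` under the partial table."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(combo) for combo in product(*choices)]
```

For the peptide "MAW" this yields four sequences that differ only at the alanine codon, since methionine and tryptophan each have a single codon; a peptide with two alanines has sixteen.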

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alter, add or delete a single amino acid or a small percentage of amino acids in the encoded
sequence are "conservatively modified variants" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typically, conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L),
Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
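The eight groups listed above can be encoded directly as sets. The sketch below is one possible reading, in which a substitution counts as conservative when the two residues differ and share at least one group; the names are hypothetical, and other substitution tables known in the art may group residues differently.

```python
# The eight groups from the text, by one-letter code.
# Methionine (M) appears in both group 5 and group 8.
CONSERVATIVE_GROUPS = [
    {"A", "G"},
    {"D", "E"},
    {"N", "Q"},
    {"R", "K"},
    {"I", "L", "M", "V"},
    {"F", "Y", "W"},
    {"S", "T"},
    {"C", "M"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if swapping `original` for `replacement` stays within a group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Because methionine belongs to two groups, M to L and M to C are both conservative under this sketch, while C to L is not.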

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3rd ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds,
although in some cases, nucleic acid analogs are included that may have at least one different
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach, Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones; non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched base pairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain
portions of both double stranded or single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
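The statement that a single depicted strand also defines its complement can be made concrete with a short sketch. It assumes plain DNA written 5' to 3' in the four standard bases; the function name is hypothetical, and modified bases such as inosine are not handled.

```python
# Watson-Crick pairing for the four standard DNA bases (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, also written 5' to 3'."""
    # Complement each base, then reverse so the result reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original strand, which is the sense in which either strand fully specifies the other.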

A "label" or a "detectable moiety" is a composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, physiological, chemical, or other physical
means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of
binding to a target nucleic acid of complementary sequence through one or more types of
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic acid
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is typically an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed in operable linkage to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen
(1993) Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid
concentration) at which 50% of the probes complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the
probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. For selective or specific hybridization, a positive
signal is typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC,
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.
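Two numeric rules in the stringent-hybridization definition lend themselves to a short worked sketch: the hybridization temperature is set about 5-10° C below Tm, and a positive signal is at least two times background (preferably 10 times). The function names below are hypothetical, and the Tm is taken as a given input rather than computed, since the text does not specify a formula for it.

```python
def stringent_temperature_range(tm_celsius: float) -> tuple[float, float]:
    """Return the (low, high) window that lies 10 to 5 degrees C below Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """Return True if `signal` is at least `fold` times `background`.

    fold=2.0 is the typical threshold in the text; fold=10.0 is the preferred one.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background
```

For a probe with a Tm of 70° C, this sketch gives a stringent window of 60-65° C; a signal of 250 units over a background of 100 passes the two-fold test but not the ten-fold one.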

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in
Molecular Biology, Lippincott.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft
agar, anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of lung cancer
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical
effects. Such functional effects can be measured by many means known to those skilled in
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other
ligands, and measuring cellular proliferation. Determination of the functional effect of a
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for lung cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or
modulators also include genetically modified versions of lung cancer proteins, e.g., versions
with altered activity, as well as naturally occurring and synthetic ligands, antagonists,
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be
identified by incubating lung cancer cells with the test compound and determining increases
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16.

Samples or assays comprising lung cancer proteins that are treated with a potential
activator, inhibitor, or modulator are compared to control samples without the inhibitor,
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher.
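The comparison to control described above amounts to a small calculation: express the treated activity as a percentage of the control (assigned 100%), then compare it with the 80% and 110% cut-offs. The sketch below assumes those two outermost thresholds and hypothetical names; how values between 80% and 110% should be reported is not stated in the text and is labeled "no call" here.

```python
def classify_modulation(treated_activity: float, control_activity: float) -> str:
    """Classify a test compound from activity relative to an untreated control.

    Uses the broadest thresholds in the text: at or below about 80% of control
    counts as inhibition, at or above 110% counts as activation.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    relative_percent = 100.0 * treated_activity / control_activity
    if relative_percent <= 80.0:
        return "inhibition"
    if relative_percent >= 110.0:
        return "activation"
    return "no call"  # assumption: the text does not name this middle band
```

For instance, a treated sample with 40 units of activity against a control of 100 units sits at 40% of control and would be scored as inhibition under this sketch.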

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci,
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells: A Manual of
Basic Technique pp. 231-241 (3rd ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney
(1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes
antibody fragments either produced by the modification of whole antibodies, or those
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies, A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990)
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783).

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.

Identification of lung cancer-associated sequences

In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).
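
By way of illustration only, the comparison of a sample's expression profile with
profiles of known states can be sketched as follows. The sketch assumes that each profile
is a mapping from a gene identifier to an expression level and that similarity is measured
by Pearson correlation over the shared genes; the function names, the data layout, and the
choice of correlation as the similarity measure are assumptions made for this example and
are not prescribed by the text.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of expression values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / math.sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return (state, correlation) for the reference profile most similar to sample.

    sample: dict of gene -> expression level (hypothetical layout)
    references: dict of state name -> dict of gene -> expression level
    """
    best = None
    for state, profile in references.items():
        genes = sorted(set(sample) & set(profile))  # compare shared genes only
        r = pearson([sample[g] for g in genes], [profile[g] for g in genes])
        if best is None or r > best[1]:
            best = (state, r)
    return best
```

In such a sketch, a patient sample would be assigned to whichever known state (e.g.,
normal or cancerous) its profile correlates with most strongly.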

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
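
By way of illustration only, the removal of genes that are expressed in significant
amounts in other normal tissues can be sketched as a simple filter. The data layout, the
function name, and the numeric cutoff `max_level` are hypothetical; the text does not
state what level of expression counts as "significant".

```python
def filter_disease_specific(candidates, normal_tissue_levels, max_level):
    """Keep candidate genes whose expression stays below max_level in every
    normal tissue recorded for them.

    candidates: iterable of gene identifiers from the lung cancer screen
    normal_tissue_levels: dict of gene -> dict of tissue -> expression level
    max_level: hypothetical cutoff for "significant" expression
    """
    kept = []
    for gene in candidates:
        # A gene with no recorded normal-tissue data is kept (nothing excludes it).
        levels = normal_tissue_levels.get(gene, {})
        if all(level < max_level for level in levels.values()):
            kept.append(gene)
    return kept
```

A variant of this sketch could restrict the check to essential organs only, consistent
with the embodiments in which expression in dispensable organs is tolerated.
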
In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. Unigene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in the lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the ratio is
presented as a number greater than one, that the ratio is greater than one, preferably 1.5 or
greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less than
one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
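
By way of illustration only, the ratio-based definitions of up-regulation and
down-regulation can be sketched as follows. The sketch assumes the ratio is always taken
as expression in cancerous tissue divided by expression in normal tissue, and it uses the
preferred values of 1.5 and 0.5 as defaults; the function name and the "unchanged" label
are assumptions made for this example.

```python
def classify_regulation(tumor_level, normal_level, up_ratio=1.5, down_ratio=0.5):
    """Label a gene by its tumor/normal expression ratio.

    up_ratio and down_ratio default to the "preferably" values in the text;
    pass 2.0 and 0.25 for the "more preferably" values.
    """
    if normal_level <= 0 or tumor_level < 0:
        raise ValueError("expression levels must be non-negative, normal level positive")
    ratio = tumor_level / normal_level
    if ratio >= up_ratio:
        return "up-regulated"
    if ratio <= down_ratio:
        return "down-regulated"
    return "unchanged"
```

For example, a gene measured at 300 units in tumor tissue and 100 units in normal lung
has a ratio of 3.0 and would be labeled up-regulated under either threshold.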


Informatics 

The ability to identify genes that are over or under expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
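
By way of illustration only, an assay record carrying the three parameters listed above,
and a cross-tabulation of such records by target and sample source, can be sketched as
follows. The class name, field names, and table layout are hypothetical and are not
taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) sample source, e.g. "control" or "lesion"
    quantity: float     # (3) absolute or relative quantity in that sample


def cross_tabulate(records):
    """Build a nested table: table[target_id][sample_source] -> quantity."""
    table = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    # Return plain dicts so missing keys raise instead of silently appearing.
    return {target: dict(by_source) for target, by_source in table.items()}
```

With such a table, the quantity of one target can be read side by side for a control
sample and a pathological specimen.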

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
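
By way of illustration only, the rank-ordering of database targets by computed
similarity to a query target can be sketched as follows. The sketch substitutes the
generic string matcher from the Python standard library (`difflib.SequenceMatcher`) for a
dedicated alignment program such as FASTA or BESTFIT, so the similarity values are not
alignment scores with gap weights; the function name and the record names are
hypothetical.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank database entries by similarity to the query, best first.

    query: sequence string
    database: dict of record name -> sequence string
    Returns a list of (name, similarity) tuples with similarity in [0, 1].
    """
    scored = [
        (name, SequenceMatcher(None, query, seq, autojunk=False).ratio())
        for name, seq in database.items()
    ]
    # Sort by descending similarity; ties are broken by record name.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

In a fuller system the similarity function would be replaced by the output of the chosen
alignment program, while the rank-ordering step would remain the same.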

Characteristics of lung cancer-associated proteins 

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or dysregulated cellular processes (see, e.g., Alberts (ed., 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.

One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res.
26:320-322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
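
By way of illustration only, the prediction of candidate transmembrane domains from a
run of approximately 17 consecutive hydrophobic amino acids can be sketched as follows.
This is a deliberately crude scan and is not the method used by PSORT or any other
published predictor; the set of residues treated as hydrophobic, the function name, and
the exact minimum length are assumptions made for this example.

```python
# Assumption: one-letter codes treated as hydrophobic for this sketch.
HYDROPHOBIC = set("AVLIMFWC")


def find_hydrophobic_runs(sequence, min_length=17):
    """Return (start, end) index pairs for runs of consecutive hydrophobic
    residues at least min_length long. Indices are 0-based, end exclusive."""
    runs = []
    start = None
    for i, residue in enumerate(sequence.upper()):
        if residue in HYDROPHOBIC:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_length:
        runs.append((start, len(sequence)))
    return runs
```

The number of runs returned gives a rough count of candidate membrane-spanning regions,
and the index pairs give their approximate localization within the protein.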

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can be also useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful lung markers of disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or exocrine (secretion, e.g., through a duct or
to adjacent epithelial surface, as in sweat glands, sebaceous glands, pancreatic ducts, lacrimal
glands, mammary glands, wax producing glands of the ear, etc.). Thus secreted molecules
often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on a mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems
such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its 
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the 
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a 
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the 
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate 
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a 
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins. 

The lung cancer nucleic acids of the present invention are used in several ways. In a 
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached 
to biochips to be used in screening and diagnostic methods, as outlined below, or for 
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications. 
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer 
proteins can be put into expression vectors for the expression of lung cancer proteins, again 
for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the 
nucleic acid sequences outlined in the figures and/or the complements thereof) are made. 
The nucleic acid probes attached to the biochip are designed to be substantially 
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target 
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that 
hybridization of the target sequence and the probes of the present invention occurs. As 
outlined below, this complementarity need not be perfect; there may be any number of base 
pair mismatches which will interfere with hybridization between the target sequence and the 
single stranded nucleic acids of the present invention. However, if the number of mutations 
is so great that no hybridization can occur under even the least stringent of hybridization 
conditions, the sequence is not a complementary target sequence. Thus, by "substantially 
complementary" herein is meant that the probes are sufficiently complementary to the target 
sequences to hybridize under appropriate reaction conditions, particularly high stringency 
conditions, as outlined herein. 
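
As a rough illustration of imperfect complementarity, the following Python sketch counts 
base pair mismatches between a probe and an equally long target site, aligned antiparallel. 
The function name and the example sequences are hypothetical. A mismatch count alone does 
not predict whether hybridization occurs under given stringency conditions; it is only a 
simple way to compare candidate probes. 

```python
# Watson-Crick pairing partners for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions where probe and target_site fail to base-pair.

    Both sequences are written 5'->3' and must have equal length. The probe
    is aligned antiparallel to the target site, so probe base i is compared
    with the target base at the mirrored position.
    """
    probe, target_site = probe.upper(), target_site.upper()
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    return sum(
        1
        for p, t in zip(probe, reversed(target_site))
        if PAIR.get(p) != t
    )


if __name__ == "__main__":
    site = "ACGTAC"                              # hypothetical target site
    print(count_mismatches("GTACGT", site))      # perfectly complementary probe
    print(count_mismatches("GTACGA", site))      # probe with one altered base
```

A perfectly complementary probe gives a count of zero; each altered base adds one. 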

A nucleic acid probe is generally single stranded but can be partially single and 
partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, 
and from about 30 to about 50 bases being particularly preferred. That is, generally 
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of 
lengths up to hundreds of bases can be used. 

In a preferred embodiment, more than one probe per sequence is used, with either 
overlapping probes or probes to different sections of the target being used. That is, two, 
three, four or more probes, with three being preferred, are used to build in a redundancy for a 
particular target. The probes can be overlapping (i.e., have some sequence in common), or 
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity. 
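
The use of several probes per target can be sketched in Python as follows. This is a minimal 
illustration, not the method actually used to design the probes of the invention: the function 
names are hypothetical, the defaults (30 bases, three probes) are simply taken from the 
preferred ranges stated above, and the even spacing of probe windows along the target is an 
assumption. Each probe is the reverse complement of a window of the target, so windows 
overlap when the target is short and are separate when it is long. 

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target: str, probe_len: int = 30, n_probes: int = 3) -> list[str]:
    """Return n_probes probes of probe_len bases spread evenly along target.

    Each probe is the reverse complement of one window of the target.
    """
    if not 8 <= probe_len <= 100:
        raise ValueError("probe length outside the general 8-100 base range")
    if n_probes < 1:
        raise ValueError("at least one probe is required")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        # Evenly spaced window start positions from 0 to last_start.
        starts = [round(i * last_start / (n_probes - 1)) for i in range(n_probes)]
    return [reverse_complement(target[s:s + probe_len]) for s in starts]


if __name__ == "__main__":
    target = "ACGT" * 15  # hypothetical 60-base target sequence
    for probe in tile_probes(target):
        print(probe)
```

With a 60-base target and the defaults, the three windows start at positions 0, 15 and 30, so 
neighboring probes share 15 bases of sequence. 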

As will be appreciated by those in the art, nucleic acids can be attached or 
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical 
equivalents herein is meant the association or binding between the nucleic acid probe and the 
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and 
removal as outlined below. The binding can typically be covalent or non-covalent. By 
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of 
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is 
the covalent attachment of a molecule, such as streptavidin, to the support and the 
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and 
grammatical equivalents herein is meant that the two moieties, the solid support and the 
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination 
bonds. Covalent bonds can be formed directly between the probe and the solid support or can 
be formed by a cross linker or by inclusion of a specific reactive group on either the solid 
support or the probe or both molecules. Immobilization may also involve a combination of 
covalent and non-covalent interactions. 

In general, the probes are attached to a biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or 
other grammatical equivalents herein is meant a material that can be modified for the 
attachment or association of the nucleic acid probes and is amenable to at least one detection 
method. Often the substrate may contain discrete individual sites appropriate for individual 
partitioning and identification. As will be appreciated by those in the art, the number of 
possible substrates is very large, and includes, but is not limited to, glass and modified or 
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and 
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), 
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including 
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the 
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is 
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S. 
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in 
its entirety. 

Generally the substrate is planar, although as will be appreciated by those in the art, 
other configurations of substrates may be used as well. For example, the probes may be 
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample 
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed 
cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g., 
the biochip is derivatized with a chemical functional group including, but not limited to, 
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being 
particularly preferred. Using these functional groups, the probes can be attached using 
functional groups on the probes. For example, nucleic acids containing amino groups can be 
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g., 
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company 
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases, 
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized, and then attached to the surface 
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or 
attachment may be via linkage to an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very strong, 
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to 
surfaces covalently coated with streptavidin, resulting in attachment. 

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in 
the art. For example, photoactivation techniques utilizing photopolymerization compounds 
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in 
situ, using known photolithographic techniques, such as those described in WO 95/25116; 
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of 
which are expressly incorporated by reference; these methods of attachment form the basis of 
the Affymetrix GeneChip™ technology. 

WO 02/086443 PCT/US02/12476 

Often, amplification-based assays are performed to measure the expression level of 
lung cancer-associated sequences. These assays are typically performed in conjunction with 
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a 
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a 
quantitative amplification, the amount of amplification product will be proportional to the 
amount of template in the original sample. Comparison to appropriate controls provides a 
measure of the amount of lung cancer-associated RNA. Methods of quantitative 
amplification are well known to those of skill in the art. Detailed protocols for quantitative 
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols, A Guide to Methods and 
Applications. 
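
The proportionality between template and product can be illustrated with an idealized 
exponential model, in which each cycle multiplies the product by (1 + E) for a constant 
per-cycle efficiency E. The Python sketch below rests on that assumption; real reactions 
plateau and efficiencies vary, and the function names and copy numbers are hypothetical. 

```python
def product_after_cycles(template_copies: float, cycles: int,
                         efficiency: float = 1.0) -> float:
    """Idealized product amount: template * (1 + efficiency) ** cycles.

    efficiency = 1.0 means perfect doubling in every cycle (an assumption).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return template_copies * (1.0 + efficiency) ** cycles


def relative_template(product_sample: float, product_control: float) -> float:
    """Ratio of sample to control template, valid when both reactions were
    run for the same number of cycles at the same efficiency."""
    return product_sample / product_control


if __name__ == "__main__":
    # Hypothetical copy numbers: the sample has four times the control's template.
    sample = product_after_cycles(400, cycles=20)
    control = product_after_cycles(100, cycles=20)
    print(relative_template(sample, control))
```

Because both reactions are scaled by the same factor, the product ratio equals the template 
ratio, which is why comparison to a control gives a measure of the starting amount. 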

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent 
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be 
extended due to a blocking agent at the 3' end. When the PCR product is amplified in 
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the 
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3' 
quenching agent, thereby resulting in an increase in fluorescence as a function of 
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com). 

Other suitable amplification methods include, but are not limited to, ligase chain 
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988) 
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification 
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence 
replication (Guatelli, et al. (1990) Proc. Nat. Acad. Sci. USA 87:1874), dot PCR, and linker 
adapter PCR, etc. 

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer 
proteins, are used to make a variety of expression vectors to express lung cancer proteins 
which can then be used in screening assays, as described below. Expression vectors and 
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel, 
supra, and Fernandez and Hoeffler (eds 1999) Gene Expression Systems) and are used to 
express proteins. The expression vectors may be either self-replicating extrachromosomal 
vectors or vectors which integrate into a host genome. Generally, these expression vectors 
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic 
acid encoding the lung cancer protein. The term "control sequences" refers to DNA 
sequences used for the expression of an operably linked coding sequence in a particular host 
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter, 
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to 
utilize promoters, polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional relationship with 
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is 
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in 
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding 
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably 
linked to a coding sequence if it is positioned so as to facilitate translation. Generally, 
"operably linked" means that the DNA sequences being linked are contiguous, and, in the 
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have 
to be contiguous. Linking is typically accomplished by ligation at convenient restriction 
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in 
accordance with conventional practice. Transcriptional and translational regulatory nucleic 
acid will generally be appropriate to the host cell used to express the lung cancer protein. 
Numerous types of appropriate expression vectors, and suitable regulatory sequences are 
known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include, but are 
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop 
sequences, translational start and stop sequences, and enhancer or activator sequences. In a 
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences may be either constitutive or inducible promoters. The promoters 
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which 
combine elements of more than one promoter, are also known in the art, and are useful in the 
present invention. 

In addition, an expression vector may comprise additional elements. For example, the 
expression vector may have two replication systems, thus allowing it to be maintained in two 
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for 
cloning and amplification. Furthermore, for integrating expression vectors, the expression 
vector often contains at least one sequence homologous to the host cell genome, and 
preferably two homologous sequences which flank the expression construct. The integrating 
vector may be directed to a specific locus in the host cell by selecting the appropriate 
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well 
known in the art (e.g., Fernandez and Hoeffler, supra). 

In addition, in a preferred embodiment, the expression vector contains a selectable 
marker gene to allow the selection of transformed host cells. Selection genes are well known 
in the art and will vary with the host cell used. 

The lung cancer proteins of the present invention are usually produced by culturing a 
host cell transformed with an expression vector containing nucleic acid encoding a lung 
cancer protein, under the appropriate conditions to induce or cause expression of the lung 
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the 
choice of the expression vector and the host cell, and will be easily ascertained by one skilled 
in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest 
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and 
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae 
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK, 
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a 
macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the lung cancer proteins are expressed in mammalian 
cells. Mammalian expression systems are also known in the art, and include retroviral and 
adenoviral systems. Of particular use as mammalian promoters are the promoters from 
mammalian viral genes, since the viral genes are often highly expressed and have a broad 
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR 
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV 
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and 
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3' 
to the translation stop codon and thus, together with the promoter elements, flank the coding 
sequence. Examples of transcription terminator and polyadenylation signals include those 
derived from SV40. 

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as 
other hosts, are well known in the art, and will vary with the host cell used. Techniques 
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated 
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the 
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei. 

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems. 
Promoters from bacteriophage may also be used and are known in the art. In addition, 
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of 
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally 
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA 
polymerase and initiate transcription. In addition to a functioning promoter sequence, an 
efficient ribosome binding site is desirable. The expression vector may also include a signal 
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The 
protein is either secreted into the growth media (gram-positive bacteria) or into the 
periplasmic space, located between the inner and outer membrane of the cell (gram-negative 
bacteria). The bacterial expression vector may also include a selectable marker gene to allow 
for the selection of bacterial strains that have been transformed. Suitable selection genes 
include genes which render the bacteria resistant to drugs such as ampicillin, 
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers 
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine 
biosynthetic pathways. These components are assembled into expression vectors. Expression 
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E. 
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and 
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells 
using techniques well known in the art, such as calcium chloride treatment, electroporation, 
and others. 

In one embodiment, lung cancer proteins are produced in insect cells. Expression 
vectors for the transformation of insect cells, and in particular, baculovirus-based expression 
vectors, are well known in the art. 

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast 
expression systems are well known in the art, and include expression vectors for 
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, 
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris, 
Schizosaccharomyces pombe, and Yarrowia lipolytica. 

The lung cancer protein may also be made as a fusion protein, using techniques well 
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope 
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen. 
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression, 
for affinity purification purposes, or for other reasons. For example, when the lung cancer 
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other 
nucleic acid for expression purposes. 

In a preferred embodiment, the lung cancer protein is purified or isolated after 
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate 
ways. Standard purification methods include electrophoretic, molecular, immunological and 
chromatographic techniques, including ion exchange, hydrophobic, affinity, and 
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein 
may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration 
and diafiltration techniques, in conjunction with protein concentration, are also useful. For 
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification. 
The degree of purification necessary will vary depending on the use of the lung cancer 
protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids 
are useful in a number of applications. They may be used as immunoselection reagents, as 
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as 
transcription or translation inhibitors, etc. 

Variants of lung cancer proteins 

In one embodiment, the lung cancer proteins are derivative or variant lung cancer 
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the 
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion 
or insertion, with amino acid substitutions being particularly preferred. The amino acid 
substitution, insertion or deletion may occur at a particular residue within the lung cancer 
peptide. 

Also included within one embodiment of lung cancer proteins of the present invention 
are amino acid sequence variants. These variants typically fall into one or more of three 
classes: substitutional, insertional or deletional variants. These variants ordinarily are 
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer 
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding 
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above. 
However, variant lung cancer protein fragments having up to about 100-150 residues may be 
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the 
predetermined nature of the variation, a feature that sets them apart from naturally occurring 
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants 
typically exhibit a similar qualitative biological activity as the naturally occurring analogue, 
although variants can also be selected which have modified characteristics as will be more 
fully outlined below. 

While the site or region for introducing an amino acid sequence variation is often 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed lung cancer variants screened for 
the optimal combination of desired activity. Techniques exist for making substitution 
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer 
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung 
cancer protein activities. 

Amino acid substitutions are typically of single residues; insertions usually will be on 
the order of from about 1 to 20 amino acids, although considerably larger insertions may be 
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although 
in some cases deletions may be much larger. 
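
The three variant classes can be made concrete with a short Python sketch that applies a 
substitution, an insertion, or a deletion to an amino acid sequence written in one-letter 
code. The helper names, the 1-based position convention, and the example peptide are 
hypothetical and serve only as illustration; in practice such variants are made at the DNA 
level as described above. 

```python
def substitute(seq: str, pos: int, new_residue: str) -> str:
    """Substitutional variant: replace the residue at 1-based position pos."""
    if not 1 <= pos <= len(seq) or len(new_residue) != 1:
        raise ValueError("invalid position or residue")
    return seq[:pos - 1] + new_residue + seq[pos:]


def insert(seq: str, pos: int, residues: str) -> str:
    """Insertional variant: insert residues after 1-based position pos
    (pos = 0 inserts at the N-terminus)."""
    if not 0 <= pos <= len(seq) or not residues:
        raise ValueError("invalid position or empty insertion")
    return seq[:pos] + residues + seq[pos:]


def delete(seq: str, start: int, end: int) -> str:
    """Deletional variant: remove residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("invalid deletion range")
    return seq[:start - 1] + seq[end:]


if __name__ == "__main__":
    peptide = "MKTAYIAK"              # hypothetical wild-type peptide
    print(substitute(peptide, 3, "S"))  # single-residue substitution
    print(insert(peptide, 2, "GG"))     # two-residue insertion
    print(delete(peptide, 4, 5))        # two-residue deletion
```

Combining these operations yields a final derivative; the original sequence is never 
modified in place, so each variant can be traced back to the wild-type input. 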

Substitutions, deletions, insertions or any combination thereof may be used to arrive 
at a final derivative. Generally these changes are done on a few amino acids to minimize the 
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When 
small alterations in the characteristics of a lung cancer protein are desired, substitutions are 
generally made in accordance with the amino acid substitution chart provided in the 
definition section. 

Variants typically exhibit essentially the same qualitative biological activity and will 
elicit the same immune response as a naturally-occurring analog, although variants also are 
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the 
variant may be designed or reorganized such that a biological activity of the lung cancer 
protein is altered. For example, glycosylation sites may be added, altered, or removed. 

Covalent modifications of lung cancer polypeptides are included within the scope of 
this invention. One type of covalent modification includes reacting targeted amino acid 
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of 
reacting with selected side chains or the N- or C-terminal residues of a lung cancer 
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking 
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method 
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully 
described below. Commonly used crosslinking agents include, e.g., 
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters 
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as 
3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as 
bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate. 

Other modifications include deamidation of glutaminyl and asparaginyl residues to 
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and 
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues, 
methylation of the γ-amino groups of lysine, arginine, and histidine side chains (Creighton 
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal 
amine, and amidation of any C-terminal carboxyl group. 

Another type of covalent modification of the lung cancer polypeptide encompassed by 
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the 
native glycosylation pattern" is intended herein to mean adding to or deleting one or more 
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns 
can be altered in many ways. For example, the use of different cell types to express lung 
cancer-associated sequences can result in different glycosylation patterns. 

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished 
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the 
addition of, or substitution by, one or more serine or threonine residues to the native sequence 
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid 
sequence may optionally be altered through changes at the DNA level, particularly by 
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that 
codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the lung cancer 
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such 
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981) 
CRC Crit. Rev. Biochem., pp. 259-306. 

Removal of carbohydrate moieties present on the lung cancer polypeptide may be 
accomplished chemically or enzymatically or by mutational substitution of codons encoding 
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation 
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987) 
Arch. Biochem. Biophys., 259:52 and by Edge, et al. (1981) Anal. Biochem., 118:131. 
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a 
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth. 
Enzymol., 138:350. 

Another type of covalent modification of lung cancer polypeptides comprises linking the lung 
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene 
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent 
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337. 

Lung cancer polypeptides of the present invention may also be modified in a way to 
form chimeric molecules comprising a lung cancer polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric 
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which 
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The 
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an 
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung 
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or 
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment, 
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an 
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the 
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known and examples 
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal 
chelation tags, the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol. 
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies 
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes 
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein 
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al. 
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science 
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem. 266:15163- 
15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) Proc. Nat'l 
Acad. Sci. USA 87:6393-6397). 

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
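
The primer length ranges given above can be illustrated with a short sketch. It is offered only as an example: the function name, the tier labels, and the use of the letter "I" to denote inosine are assumptions made for illustration and are not part of the disclosure.

```python
def classify_primer(seq: str) -> str:
    """Classify a candidate PCR primer by length.

    Ranges follow the text: about 15 to about 35 nucleotides is
    preferred, and about 20 to about 30 is more preferred.
    """
    allowed = set("ACGTI")  # 'I' stands for inosine (illustrative convention)
    seq = seq.upper()
    if not seq or set(seq) - allowed:
        raise ValueError("primer may contain only A, C, G, T, or I (inosine)")
    n = len(seq)
    if 20 <= n <= 30:
        return "more preferred"
    if 15 <= n <= 35:
        return "preferred"
    return "outside preferred range"


if __name__ == "__main__":
    # A 22-mer with inosine at degenerate positions (hypothetical sequence)
    print(classify_primer("ACGTIACGTIACGTIACGTIAC"))
```

The cutoffs are treated as hard limits here, whereas the text says "about", so values near a boundary would in practice be a matter of judgment.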

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally will also comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-329; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to TNF-α, TNF-β, IL-1, IFN-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein thereby mediating cytotoxicity
or antibody-dependent cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the lung cancer protein can be targeted within a
cell, i.e., the nucleus, an antibody thereto may contain a signal for that target localization, i.e.,
a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1 μM or
better, and most preferably, 0.01 μM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.
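
The binding thresholds above can be expressed as a simple tiering of dissociation constants, where a smaller Kd means tighter binding. The sketch below reads the thresholds as 0.1 mM, 1 μM, 0.1 μM and 0.01 μM, which is an assumption where the scanned units are unclear; the function name and tier labels are likewise illustrative only.

```python
def kd_tier(kd_molar: float) -> str:
    """Map a dissociation constant (in mol/L) to a binding tier.

    Thresholds are an assumed reading of the text:
    0.1 mM, 1 uM, 0.1 uM and 0.01 uM. Lower Kd is tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar <= 1e-8:   # 0.01 uM or better
        return "most preferred"
    if kd_molar <= 1e-7:   # 0.1 uM or better
        return "preferred"
    if kd_molar <= 1e-6:   # 1 uM or better
        return "more usual"
    if kd_molar <= 1e-4:   # 0.1 mM or better
        return "specific"
    return "not specific"


if __name__ == "__main__":
    for kd in (1e-3, 5e-5, 5e-7, 5e-8, 5e-9):
        print(f"{kd:.0e} M -> {kd_tier(kd)}")
```

Affinity alone does not establish selectivity; as the text notes, binding to related sequences would have to be assessed separately.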

Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
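
The quantitative thresholds above can be made concrete with a brief sketch. The text does not define how a percent change is computed for downregulation, so the symmetric definition used here (percent change = (fold - 1) x 100, with fold taken as the larger level divided by the smaller) is an assumption, as are the function names.

```python
def expression_change(sample: float, reference: float):
    """Return (direction, percent_change) for two expression levels.

    Assumed definition: fold is the larger level over the smaller,
    and percent change is (fold - 1) * 100, so that up- and
    downregulation are measured on the same scale.
    """
    if sample <= 0 or reference <= 0:
        raise ValueError("expression levels must be positive")
    if sample >= reference:
        direction, fold = "up", sample / reference
    else:
        direction, fold = "down", reference / sample
    return direction, (fold - 1.0) * 100.0


def highest_threshold_met(percent: float):
    """Return the largest stated threshold (50 to 300 percent) that is met."""
    met = [t for t in (50, 100, 150, 200, 300) if percent >= t]
    return max(met) if met else None


if __name__ == "__main__":
    direction, pct = expression_change(sample=300.0, reference=100.0)
    print(direction, pct, highest_threshold_met(pct))
```

Under this definition a three-fold increase corresponds to a 200% change, and a four-fold decrease to a 300% change.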

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example a digoxigenin labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
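
The relationship between the observed change and the desired modulation can be sketched as follows. This is an illustration only: the function name and the return format are assumptions, and the sketch simply inverts the observed fold change as the two worked cases in the text suggest.

```python
def target_modulation(cancer_level: float, normal_level: float):
    """Return (direction, fold) of the modulation a test compound would
    need to induce to bring expression in lung cancer tissue back to the
    level seen in normal tissue.
    """
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    if cancer_level > normal_level:
        # Gene is upregulated in cancer, so a decrease is sought.
        return "decrease", cancer_level / normal_level
    if cancer_level < normal_level:
        # Gene is downregulated in cancer, so an increase is sought.
        return "increase", normal_level / cancer_level
    return "none", 1.0


if __name__ == "__main__":
    print(target_modulation(cancer_level=4.0, normal_level=1.0))
    print(target_modulation(cancer_level=1.0, normal_level=10.0))
```

The first call corresponds to the 4-fold increase case and the second to the 10-fold decrease case described above.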

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene or protein expression monitoring of a number of
entities, i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.
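
The parallel assay mixtures at different agent concentrations described above, with one mixture serving as a zero-concentration negative control, can be laid out as in the sketch below. The serial dilution scheme, the function name, and the example values are assumptions for illustration; the text does not prescribe any particular concentration spacing.

```python
def concentration_series(top: float, dilution_factor: float, n_points: int):
    """Return a list of agent concentrations for parallel assay mixtures.

    The first entry is the negative control (zero concentration); the
    remaining entries are a serial dilution starting from `top`.
    """
    if top <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top > 0, dilution_factor > 1, n_points >= 1")
    series = [0.0]  # negative control
    series.extend(top / dilution_factor ** i for i in range(n_points))
    return series


if __name__ == "__main__":
    # Hypothetical example: 100, 10 and 1 (arbitrary units) plus the control
    print(concentration_series(top=100.0, dilution_factor=10.0, n_points=3))
```

A differential response would then be read by comparing the signal in each mixture against the control entry.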

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a
library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds
generated by either chemical synthesis or biological synthesis by combining a number of
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e.,
the number of amino acids in a polypeptide compound). Millions of chemical compounds
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop,
et al. (1994) J. Med. Chem. 37(9):1233-1251).
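
The size of such a linear library grows as the number of building blocks raised to the
compound length. The following sketch, which assumes the 20 standard amino acids written
as one-letter codes and uses hypothetical function names, enumerates a small library in
silico to make the arithmetic concrete.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (an assumption of this
# sketch; a real library may use other building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear compounds of the given length."""
    return len(alphabet) ** length


def enumerate_library(length, alphabet=AMINO_ACIDS):
    """Yield every sequence of the given length, one building block per position."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


dipeptides = list(enumerate_library(2))
pentapeptide_count = library_size(5)
```

Under these assumptions, pentapeptides alone already number 20 to the fifth power, i.e.,
3.2 million sequences, consistent with the "millions of chemical compounds" noted above.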

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, Rainin,
Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos,
Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide
expression, and polypeptide activity.

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes, or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random
sequences of nucleotides and amino acids, respectively. Since these random peptides (or
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
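
The distinction between a fully randomized and a biased library can be sketched as a
per-position template, in which each position lists the residues allowed there. The
template format, the function name, and the membership of the hydrophobic class below are
assumptions made for illustration only.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # fully random position
HYDROPHOBIC = "AVILMFW"                # illustrative class; membership is an assumption


def biased_library(template):
    """Yield every peptide allowed by a per-position template.

    template is a list of strings; the string at index i lists the residues
    permitted at position i. A one-character string holds that position
    constant.
    """
    for combo in product(*template):
        yield "".join(combo)


# Hypothetical tetrapeptide design: constant Cys, a hydrophobic residue,
# a fully random residue, and a constant Pro.
template = ["C", HYDROPHOBIC, AMINO_ACIDS, "P"]
peptides = list(biased_library(template))
```

Passing a template in which every position is the full alphabet reproduces the fully
randomized case; restricting positions shrinks the library toward the desired class.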

Modulators of lung cancer can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
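
One common way to express changes in expression level between two states is a log2
ratio per gene. The sketch below is illustrative only: the gene names, the signal values,
and the two-fold threshold are hypothetical, and this specification does not prescribe
any particular metric or cutoff.

```python
import math


def expression_changes(normal, tumor, min_abs_log2=1.0):
    """Return {gene: log2(tumor / normal)} for genes measured in both states
    whose absolute log2 ratio meets the threshold.

    normal and tumor map gene identifiers to positive expression signals.
    Genes with a non-positive signal in either state are skipped because the
    ratio is undefined for them.
    """
    changes = {}
    for gene in normal.keys() & tumor.keys():
        if normal[gene] <= 0 or tumor[gene] <= 0:
            continue
        log2_ratio = math.log2(tumor[gene] / normal[gene])
        if abs(log2_ratio) >= min_abs_log2:
            changes[gene] = log2_ratio
    return changes


# Hypothetical signals, for illustration only.
normal_signals = {"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 50.0}
tumor_signals = {"GENE_A": 400.0, "GENE_B": 210.0, "GENE_C": 12.5}
profile = expression_changes(normal_signals, tumor_signals)
```

In this example GENE_A is up four-fold, GENE_C is down four-fold, and GENE_B falls below
the threshold and is omitted from the profile.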

Screens are performed to identify modulators of the lung cancer phenotype. In one
embodiment, screening is performed to identify modulators that can induce or suppress a
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent treated cells. In addition, antibodies can be raised against the agent induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measure of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptide of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins
may be used.

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of a convenient
shape. Examples of suitable insoluble supports include microtiter plates, arrays, membranes
and beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides,
nylon or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient
because a large number of assays can be carried out simultaneously, using small amounts of
reagents and samples. The particular manner of binding of the composition is typically not
crucial so long as it is compatible with the reagents and overall methods of the invention,
maintains the activity of the composition, and is nondiffusable. Preferred methods of binding
include the use of antibodies (which do not sterically block either the ligand binding site or
activation sequence when the protein is bound to the support), direct binding to "sticky" or
ionic supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test
compound is added to the assay. Alternatively, the candidate agent is bound to the support
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding,
functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing
off excess reagent, and determining whether the label is present on the solid support. Various
blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., 125I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40° C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
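
The two-sample comparison described above can be sketched numerically. The function
names, the signal values, and the 50% threshold are hypothetical; this specification only
requires that a change or difference in competitor binding be observed between the samples.

```python
def competitor_change(first_sample_signal, second_sample_signal):
    """Fractional change in bound, labeled competitor in the second sample
    (protein + competitor + test compound) relative to the first sample
    (protein + competitor only). Positive values mean less competitor bound.
    """
    if first_sample_signal <= 0:
        raise ValueError("first-sample competitor binding must be positive")
    return (first_sample_signal - second_sample_signal) / first_sample_signal


def is_candidate_binder(first_sample_signal, second_sample_signal, threshold=0.5):
    """Flag a test compound when competitor binding differs between the two
    samples by at least the chosen fraction (the threshold is an assumption).
    """
    change = competitor_change(first_sample_signal, second_sample_signal)
    return abs(change) >= threshold


# Hypothetical bound-competitor signals (arbitrary units).
strong_hit = is_candidate_binder(1000.0, 250.0)   # 75% less competitor bound
non_hit = is_candidate_binder(1000.0, 950.0)      # 5% difference
```

In practice the threshold would presumably be set from the variability of replicate control
samples rather than fixed in advance.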

Alternatively, differential screening is used to identify drug candidates that bind to the
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent is determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
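
Replicate counts from control and test samples can be summarized and compared as in the
following sketch. The choice of Welch's t statistic, the function names, and the counts are
assumptions for illustration; this specification does not name a particular statistical test.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(counts):
    """Return (mean, sample standard deviation) of replicate counts.
    At least triplicate measurements are expected, as described above.
    """
    if len(counts) < 3:
        raise ValueError("at least triplicate measurements are expected")
    return mean(counts), stdev(counts)


def welch_t(test_counts, control_counts):
    """Welch's t statistic for the difference between test and control means.
    Larger absolute values suggest a difference unlikely to be replicate noise.
    """
    test_mean, test_sd = summarize(test_counts)
    control_mean, control_sd = summarize(control_counts)
    standard_error = sqrt(
        test_sd ** 2 / len(test_counts) + control_sd ** 2 / len(control_counts)
    )
    return (test_mean - control_mean) / standard_error


# Hypothetical scintillation counts (counts per minute).
test_cpm = [1000.0, 1010.0, 990.0]
control_cpm = [100.0, 110.0, 90.0]
control_mean, control_sd = summarize(control_cpm)
t_statistic = welch_t(test_cpm, control_cpm)
```

Converting the statistic to a p-value would require the t distribution, which the Python
standard library does not provide, so that step is left out of this sketch.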

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a lung cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The
method comprises administration of a lung cancer inhibitor. In another embodiment, a
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor.

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below.

Soft agar growth or colony formation in suspension 
Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also, the methods section of Garkavtsev, et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
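
By way of illustration only, the labeling index referred to above is the percentage of
counted cells that have incorporated label. The following Python sketch shows that
arithmetic. The function name and the cell counts are hypothetical and are not taken from
Freshney or from this disclosure.

```python
def labeling_index(labeled_cells: int, total_cells: int) -> float:
    """Return the percentage of counted cells scored as labeled.

    Both arguments are raw counts from an autoradiograph; how cells are
    scored as "labeled" is left to the assay protocol (an assumption here).
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts: control transformed cells versus cells transfected
# with a candidate sequence, both scored at saturation density.
control_index = labeling_index(labeled_cells=120, total_cells=400)
transfected_index = labeling_index(labeled_cells=30, total_cells=400)
```

A lower index in the transfected population than in the control would be consistent with
restored density limitation of growth, subject to appropriate replicates and controls.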

Growth factor or serum dependence 

Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
(1985) Anticancer Res. 5:111-130.

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo 

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10^ cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that have a statistically significant reduction
(using, e.g., Student's t test) are said to have inhibited growth.
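
By way of illustration only, the comparison described above can be sketched in Python as
follows. The text specifies only that growth is measured "by volume or by its two largest
dimensions" and compared using, e.g., Student's t test. The ellipsoid approximation
V = (L x W^2) / 2, the pooled-variance (equal-variance) form of the t statistic, the function
names, and all measurements below are assumptions made for the example.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm: float, width_mm: float) -> float:
    """Approximate tumor volume (mm^3) from its two largest dimensions.

    Uses the ellipsoid approximation V = (L * W^2) / 2, which is an
    assumption of this sketch rather than a formula given in the text.
    """
    return 0.5 * length_mm * width_mm ** 2


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Each group needs at least two values.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical (length, width) caliper readings in mm, one pair per animal.
treated = [tumor_volume(l, w) for l, w in [(6.0, 4.0), (7.0, 4.5), (5.5, 4.0), (6.5, 5.0)]]
control = [tumor_volume(l, w) for l, w in [(10.0, 8.0), (11.0, 7.5), (9.5, 8.0), (12.0, 9.0)]]

t_stat, dof = student_t(treated, control)
```

The resulting statistic and degrees of freedom would then be compared against a tabulated
critical value for the chosen significance level. A negative statistic beyond that value
indicates smaller tumors in the treated group.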

Polynucleotide modulators of lung cancer

Antisense and RNAi Polynucleotides

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the
anti-sense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
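
By way of illustration only, deriving candidate antisense oligonucleotides from a coding
sequence amounts to taking the reverse complement of each window of the chosen length.
The Python sketch below enumerates such windows. The function name, the default length
of 20 nucleotides, and the input sequence are hypothetical. The sketch does not rank
candidates by accessibility, specificity, or stability, which a practical design would need
to consider.

```python
# Watson-Crick complement table for the RNA alphabet.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_candidates(mrna: str, length: int = 20):
    """Return (start, oligo) pairs for every window of `length` nucleotides.

    `start` is the 0-based position of the window in the target sequence and
    `oligo` is its reverse complement, written 5' to 3' in the RNA alphabet.
    A cDNA-style input is accepted: T is read as U.
    """
    if length < 14:
        raise ValueError("the text calls for fragments of at least about 14 nucleotides")
    seq = mrna.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    return [
        (start, seq[start:start + length].translate(_COMPLEMENT)[::-1])
        for start in range(len(seq) - length + 1)
    ]


# Hypothetical 17-nucleotide target, for demonstration only.
candidates = antisense_candidates("AUGGCCAAGUUCGAUCC", length=14)
```

For a DNA oligonucleotide, the same result would be written with T in place of U.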

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature
411:494-498. The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.

Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0360 257; U.S. Patent No.
5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is
down-regulated in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.

Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one lung cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e.,
a wild-type gene.

The sequence of all or part of the lung cancer gene can then be compared to the
sequence of a known lung cancer gene to determine if any differences exist. This can be
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the
presence of a difference in the sequence between the lung cancer gene of the patient and the
known lung cancer gene correlates with a disease state or a propensity for a disease state, as
outlined herein.
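
By way of illustration only, once a patient sequence has been aligned to the known
(wild-type) sequence, the differences can be listed position by position. The Python sketch
below assumes the two sequences are already aligned and of equal length. It does not
perform the alignment itself and is not a substitute for a homology program such as Bestfit.
The function name and the sequences are hypothetical.

```python
def sequence_differences(sample: str, reference: str):
    """List mismatches between two pre-aligned sequences of equal length.

    Returns (position, reference_base, sample_base) tuples, where position
    is 1-based. Gap characters such as '-' are compared like any other symbol.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (sample_base, ref_base) in enumerate(
            zip(sample.upper(), reference.upper())
        )
        if sample_base != ref_base
    ]


# Hypothetical aligned fragments: the sample differs from the reference
# at a single position.
differences = sequence_differences(sample="ACGTTA", reference="ACGTAA")
```

An empty result indicates that no difference was found over the compared region.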

In a preferred embodiment, the lung cancer genes are used as probes to determine the
number of copies of the lung cancer gene in the genome.

In another preferred embodiment, the lung cancer genes are used as probes to
determine the chromosomal localization of the lung cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when
chromosomal abnormalities such as translocations, and the like are identified in the lung
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present 
invention can be done in a variety of ways, including, but not limited to, orally, 
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly, 
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly 
applied as a solution or spray. 

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following:
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics, supra.

The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be administered depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can 
be administered alone or in combination with additional lung cancer modulating compounds 
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides 
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi 
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present 
invention provides methods, reagents, vectors, and cells useful for expression of lung
cancer-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems. 

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999), and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivicaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806;
Hipp, et al. (2000) In Vivo 14:571-85).

Methods for the use of genes as DNA vaccines are well known, and include placing a 
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter 
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene 
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably 
encodes portions of the lung cancer proteins including peptides derived from the lung cancer 
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a 
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung cancer-associated 
genes or sequences encoding subfragments of a lung cancer protein are introduced 
into expression vectors and tested for their immunogenicity in the context of Class I MHC 
and an ability to generate cytotoxic T cell responses. This procedure provides for production 
of cytotoxic T cell responses against cells which present antigen, including intracellular 
epitopes. 

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal 
models of lung cancer. When the lung cancer gene identified is repressed or diminished in 
metastatic tissue, gene therapy technology, e.g., wherein antisense or inhibitory RNA is directed 
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models 
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or 
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout 
technology, e.g., as a result of homologous recombination with an appropriate gene targeting 
vector, will result in the absence or increased expression of the lung cancer protein. When 
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary. 
It is also possible that the lung cancer protein is overexpressed in lung cancer. As 
such, transgenic animals can be generated that overexpress the lung cancer protein. 
Depending on the desired expression level, promoters of various strengths can be employed 
to express the transgene. Also, the number of copies of the integrated transgene can be 
determined and compared for a determination of the expression level of the transgene. 
Animals generated by such methods will find use as animal models of lung cancer and are 
additionally useful in screening for modulators to treat lung cancer. 

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In diagnostic and research applications such kits may include 
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi, 
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of 
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or 
another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (e.g., 
protocols) for the practice of the methods of this invention. While the instructional materials 
typically comprise written or printed materials, they are not limited to such. A medium 
capable of storing such instructions and communicating them to an end user is contemplated 
by this invention. Such media include, but are not limited to, electronic storage media (e.g., 
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such 
media may include addresses to internet sites that provide such instructional materials. 

The present invention also provides for kits for screening for modulators of lung 
cancer-associated sequences. Such kits can be prepared from readily available materials and 
reagents. For example, such kits can comprise one or more of the following materials: a lung 
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing 
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer 
protein. A wide variety of kits and components can be prepared according to the present 
invention, depending upon the intended user of the kit and the particular needs of the user. 
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
typically will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
Example 1: Gene Chip Analysis 

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as 
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993). 
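
As a rough illustration of how expression ratios from a gene chip comparison of this kind might be screened, the following Python sketch computes a tumor-to-normal ratio for each probe and keeps the probes at or above a cutoff. The probe identifiers, the intensity values, the MIN_RATIO cutoff, and the function name overexpressed_probes are all hypothetical; they are assumptions made for illustration and are not taken from the cited protocols or from the data reported here.

```python
from statistics import mean

# Hypothetical cutoff: keep probes whose tumor/normal ratio is at least this value.
MIN_RATIO = 2.0


def overexpressed_probes(tumor, normal, min_ratio=MIN_RATIO):
    """Return {probe: ratio} for probes with mean tumor / mean normal >= min_ratio.

    tumor and normal each map a probe identifier to a list of intensities,
    one per sample. Probes missing from either map, or whose mean normal
    intensity is not positive, are skipped.
    """
    selected = {}
    for probe, tumor_values in tumor.items():
        normal_values = normal.get(probe)
        if not tumor_values or not normal_values:
            continue
        normal_mean = mean(normal_values)
        if normal_mean <= 0:
            # A ratio is undefined without a positive normal baseline.
            continue
        ratio = mean(tumor_values) / normal_mean
        if ratio >= min_ratio:
            selected[probe] = round(ratio, 2)
    return selected


# Made-up intensities for three illustrative probes.
tumor = {"probe_A": [400.0, 600.0], "probe_B": [110.0, 90.0], "probe_C": [30.0, 50.0]}
normal = {"probe_A": [100.0, 100.0], "probe_B": [100.0, 100.0], "probe_C": [0.0, 0.0]}

print(overexpressed_probes(tumor, normal))
```

A simple ratio of means is used here only because it is easy to follow; an actual analysis would normally also involve normalization across chips and some handling of background signal, neither of which is attempted in this sketch.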
Aldehyde dehydrogenase 8 


a52 


Z25 


102S81 


U61145 


Hs.77256 


Enhancer of zaste (DrosophBa) homOtog 2 


0.91 


Z46 


102610 


U65011 


H^^743 


Prelerenlially expressed antigen in meta 


1 


3.88 


102623 


U66083 


Hs.37110 


"Melanoma antigen. fariOy A, 9 (MAGE-9)" 


1 


1 


102669 


U71207 


Hs.29279 


Eyes absent (Drosophiia) homolog 2 


1 


1 


102896 


U74612 


Hs^9 


RMltheadboxMI 


1.08 


Z77 


102B29 


U91618 


Hs^962 


Neurotensin 


1 


1 


102888 


X04741 


Hs.76118 


Ubiquifin cart)QKyMerminal esterase LI 


1.13 


Z59 


102913 


Xa7696 


Hs^342 


keiaSnlS 


0.7 


4.72 


102915 


XQ7820 


Hs.2258 


Matrix Metalloproteinase 10 (Stromolysin 


1.15 


3.35 


102983 


X15943 


Hs.37058 


'CalcitDnln/catdtonin-related pdypepti 


1 


1 


103021 


X53587 


HS.8S266 


1ntogrin,b8ta4' 


1.38 


Z34 


103036 


XS4925 


Hs.83169 


Matrix metfidtepretease 1 (Inteisilttal c 


1 


14.93 


103058 


X57348 


HS.184S10 


StraGfin 


lis 


4.17 


103060 


X57766 


H5.155324 


matrix matalloproleinase 11 (stromelysin 
"Cadherin 3. P<cadherin (plaoental)' 


1 


1.72 


103119 


X63629 


Hs2877 


1.16 


7.38 


103206 


X72755 


Hs.77367 


monoUnfi Induced by gamnrainleffsfon 


a7i 


\M 


103242 


X76342 


Hs.389 


"Alcotoi dehydrogenase 7 (dass \^ m 
'Lymphocyte ant^oi 6 comptex, locus D; 


1 


1 


103312 


XB2693 


Hs^185 


0.92 


1.28 


103478 


Y07765 


HsJ8991 


S100 calcium-binding pratetn A2 


1.05 


5.81 


103558 


Z19574 


H5.2785 


keratin 17 


0.65 


6.68 


103576 


226317 


H82631 


De8nioglein2 


0,79 


1.73 


103587 


Z29083 , 


Hs^128 


5r4 Oncofetal anfigen 


1 


as3 


103594 


Z31560 


Hs316 


*SRY (sex detemnining region Y)-bax 2, p 


a7i 


7.23 


103768 


AA089997 




'ESTs, Highly M\a to integral membra 


0l99 




104158 


AA454908 


Hs.8127 


KIAAD144 gene product 


0.96 


1.29 


104558 


R56678 


HsJ8959 


Human DNAsaquenoe from done 967N21 on 


1.23 


7.23 


104689 


AA010865 




ESTs 


0.96 


Z11 


104733 


AA019498 


H&23071 


ESTs 


1.18 


1.88 


104906 


AA055809 


Hs.26802 


Protein kinase domains ooniainihg protel 


1.11 


3.15 


104978 


AA088458 


Hs,19322 


ESTs; Weakly similar to til! ALU SUBFAMI 


1.64 


Z89 


105012 


AA116036 


K&9329 


'Homo saptens mRNA for fl8353, comptete 


1.19 


3.91 


105175 


AA186804 


Hsu!5740 


ESTs; Wsddy similar to unknown [S.oer9V 


0.9 


4.63 


105263 


AA227926 


Ks.6682 


ESTs 


0.95 


Z87 


105298 


AA233459 


Hs.26369 


ESTs 


1 


1.13 


105312 


AA233854 


Hs.23348 


S-phase kinase-assodated protein 2 (p45 


1.32 


aoi 


105719 


AA291644 


Hs3S793 


Hypothetical protein FU23186 


1.28 


Z31 


105743 


AA293300 


H8.g598 


^8 


1 


1 


106012 . 


AM1 1621 


H8.8895 


ESTs; same as GFH6? 


0.94 


Z04 


106231 


AA429571 


Hs^8002 


K1AA1355 protein 


1.04 


1.5 


106540 


AM54607 


Hs.38114 


Hypothetical protein FU11100 


1.26 


Z26 


106575 


AA45e039 


Ks.105421 


ESTs 


1 


2 


106632 


AA459897 


H3.11950 


GPl^chored metastasis-assodated prote 


0.87 


1.32 


106727 


M465342 


Kb^045 


Hypothetical protein FU20764 


0.87 


1.59 


106906 


AM90237 


Hs,222024 


Transcriptbn factor BMAL2 (cycte^ke f 


0.61 


1.6 


107059 


AA608545 


H5.23044 


RAD51 (S. cerevisiae) homolog (E coll Re 


0.48 


Z67 


107104 


AA609766 


Ks.15243 


Nuc)eolarprotein1{120kO) 


1.01 


1.44 


107151 


AA621169 


Hs.8687 


ESTs; procdlagen t-N proteinase 


0.97 


Z89 


107284 


S74039 


Hs.291904 


Accessory p/otefns BAP31/BAP29 


1.15 


aes 


107901 


AA026418 


Hs.gi539 


ESTs 


0.72 


%M 


107922 


AA02802B 


Hs.61460 


Ig superfamily receptor LNIR precursor 


1 


Z48 


107932 


AA029317 


Hs.18a78 


Hypotheficsl protein FU21620 


1 


1 


108695 


AA121315 


Hs,7Da23 


KIAA1077 protein 


0.91 


153 


108B57 


AA133250 


Hs.62180 


ESTs 


1 


1 


108860 


AA133334 


Hs.129911 


ESTs 


a73 


7.3 


108990 


AA15229& 


Hs.72045 


ESTs 


1 


1 


109166 


AA179845 


Hs73625 


'RAB6 interacting, kinesin-like (rabkine 


1 


4.55 


109424 


AA227919 


Hs.85962 


Hyakjronan synthase 3 


1 


1.28 


109665 


R}S012 


Hs.27027 


Hypothettoai protein DKFZp762H1311 


1.42 


2 


109970 


H09281 


H5.13234 


ESTs 


1.13 


Z16 
110015 


H10998 


Hs.7164 


A (fisintegrin and metalloproteinase doma 


OM 


1.95 


110156 


H18957 


Hs.4213 


ESTs 




1.41 


110561 


H59617 


Hs.5199 


HSPC1 SO protein similar to ubiquifiivcQn 




3.18 


111223 


^68921 


Hs.34806 


ESTs; Weakly similar to naogenin [H^i 


0^ 


3.13 


111345 


N89820 


Hs.14559 


Hypotheticai protein FU10S40 


1 


1.25 


111876 


R38239 


Ks.293246 


'ESTs. Weakly similar to putative p150 [ 


0.83 


1.Z7 


111902 


R39191 


Hs.109445 


KIAA1020 protein 


0.91 


091 


112244 


R51309 


Hs.70823 


K1M1077 protein 


0.77 


3.01 


112973 


T17271 




-CDMA FU1330B fis. done OVARC1001436. 


1 


1 


112989 


T23482 


Hs.89981 


*D!acylo)yoeral kinase, Z8ta {104kD)- 


055 


1.03 


113047 


T25867 


Hs.7549 


ESTs 


0^ 


2 


113095 


T40920 


Hs.126733 


ESTs 


1 


1 

1.44 


113531 


T90345 


Hs.16740 


Hypotlietk^l protein FLJ11036 


a42 


113970 


W86746 


Hs.8109 


ESTs 


1.17 


1.73 


114346 


Z41450 


Hs.130489 


'ATPase, amlnophasphaiipM transporter-l 


0.66 


052 


114407 


AA0101B8 


Hs.103305 


ESTs 


OJ 


158 


114471 


AA028074 


Hs.104613 


RP42homotog 


1.06 


1.34 


114509 


AA043551 


Hs.101799 


KIAA1350 protein 

"Gap junction protein, beta 5 (connexin 


1.82 


2.32 


115060 


AA253214 


Hs.198249 


0.79 


1.49 


115091 


/\A255900 


Hs.184523 


KIAA0985 protein 


a72 


152 


115123 


AA2S6642 


Hs.236894 


-ESTs. High sim to U^l.hu low density ) 


059 


1.97 


115291 


AA278943 


Hs.122579 


ESTs 


1 


1.25 


11S506 


AA292537 


Hs.45207 


Hypotttetica) protein KIAA1335 


1.15 


1.48 


115522 


AA331393 


H5.47378 


ESTs 


0.5 


3.29 


115536 


AA347193 


Hs.62180 


ESTs 


1 


1 


115697 


AM11502 


Hs.63325 


Homo saf^s type U membrane seiine pro 


1 


6.53 


115909 


AA436666 


H&S9761 


ESTs ^ 


1 


&g8 


115978 


AA447522 


H^.695l 7 


[^^srentially expr^sed in Fanooni anem 


1 


2.31 


116028 


M452112 


Ks.42644 


thk}redo)dn-like 


099 


1.68 


116107 


AA4S6968 


Hs.g2030 


ESTs 


1.14 


15 


116134 


AA460246 


Hsi0441 


CGMMproiein 


1,11 


1.86 


116157 


AA461063 


H6.44298 


HypottietKal protein 


0.99 


15 


116156 


AA461187 


Hs.61762 


HyixBda-inducaile protein 2 


0.44 


0.66 


116335 


AA4g583Q 


Hs.87013 


-Homo sapens cONA FU10238 done H 


0.62 


189 


116483 


C14092 


Hs.76118 


Ut^uitin carboxyMennina) esterase LI 


1.04 


2.36 


117320 


N23239 


Hs^11092 


LUNX protein; PLUNC(patete lung & nasal 


051 


064 


117557 


N33920 


H5.44532 


DiubiquHin 


1.11 


2.63 


117693 


N40939 


H5.112110 


PTD007 protein 


098 


1.79 


117881 


N50073 


Hs260622 


Butyrate-induced transcript 1 


1 


1.43 


116368 


N54339 


Hs.48956 


ESTs 


0.67 


286 


118566 


N6B550 


Hs.42624 


Hypottieticel protein FU10716 




083 


116695 


N71781 


HS50081 


KIAA1199seeCVA7^ 


088 


1.63 


119780 


W72967 


H8w191381 


ESTs; WeaMy simiter to hypolheikal pro 


1 


1 


119845 


W7g920 


Hs^561 


G proteffHXNipted ncoptat 8f7 


1 


1 


120102 


W95428 


Hs.132927 


-ESTs. ModeratelysMtartop53regUiat 


1 


1 


120104 


W95477 


Hs.180479 


ESTs 


069 


3,07 


120486 


AA253400 


H8.137568 


Tisnor prcrteln 63 kOa with strong homobg 


1.08 


12.05 


120859 


AA3501S8 


H&1619 


Achaet&«cuto comptex (Drosophfla) homd 


1 


1 


120880 


AA360240 


Hs.97019 


EST 


1 


1 


120946 


AA397822 


Hs.104650 


Hypotiietical protein FLI10292 


1.04 


2.15 


120983 


AA3g8209 


Hs.97587 


EST 


1 


1 


121362 


AA405500 


Ha.97932 


QuMidromodidn 1 precursor 


1 


1 


121369 


AA40S557 


Hs.128791 


CGMSprotetn 


1 


15 


121791 


AM23978 


Hs.293317 


"ESTs, Weakly simSar to JM27 {H.sapien8 


1 


1 


123005 


M479726 


Hs.105577 


ESTs 


1 


1 


123044 


AA481M9 


Hs.130881 


&«eil CH/lymplioma 1 1 A (zinc finger pro 


0.95 


158 


123160 


AA488687 


H8.28423S 


ESTs 


1.59 


458 


123479 


MS994e9 


HS.1350S6 


done RP5-850E9 on dvoniosome20 


1.19 


1.64 


123571 


AA608956 


Hs.112619 


-ESTs. W^ sinflar to PQ0109 PurMr^ 


1.03 


1.14 


123829 


AA6^597 


Hs.1 12208 


XAGE-1 protein 


1.39 


2.2 


124(M)6 


060302 


Hs.108977 


ESTs 


1 


4.85 


124059 


F13673 


HS.997G9 


ESTs 


1.49 


062 


124960 


T15386 


Hs.194766 


Seizure reteted gene 6 (mouse}^ 


0.76 


0.77 


125218 


W73561 


Hs.110024 


NADHtubk^uinone oxidoreductase MU^Q subu 


1.33 


1.77 


125453 


f^6041 


H5.18048 


"MatenomaanOgen. family A, 10" 


08 


1,42 


125759 


AA425587 


Hs.82226 


Glycoprolsin (bensniendirane) mrto 


1.S2 


2.26 


128972 


AA4345G2 


H&35406 


-ESTs, Highly simiter to unnamed protein 


1.05 


2.48 


125994 


H55782 


Hs270799 


EST 


1 


1.95 


126395 


N70192 


Hs^78956 


Hypottietteal protein F1J12929 . 


1 


1.35 


126645 


A1167942 


Hs.61635 


STEAP1 {HomosaptensBACdoneRG041011 


1 


2.23 


127221 


AI354332 


Hs.72365 


ESTs 


0.73 


3.27 


127479 


AA513722 


Hs.179729 


ooBagen; ^ X; alpha 1 (Schmid meteph 


051 


154 


128192 


AI204246 




KIAA1085protdn 


15 


3.16 


128610 


L38606 


Hs.10247 


acfivated leucocyte cell adheston motecu 


0.89 


097 


128777 


U46006 


Hs,10526 


Cysteine and glydne^lch protein 2 


1 


1 


128924 


AA234g62 


Hs^6567 


PtekopMlin3 

"Sdute carrier temHy 2 (Militated gl 


U 


2.97 


128041 


H56B73 


H8.169902 


054 


2.04 


129099 


H50398 


Hs.10a660 


-ATP^Inding cassette, sUb-temily C (CFT 


057 


1.04 


129404 


AA172056 


Hs.111128 


ESTs 


1 


1 


129466 


U2S83 




'Genbank Homo sapiens keratin 6 isofomi 


072 


12.67 


129605 


S72493 


Hs.115947 


Ketalin 16 (tocat non^pidennolytic palm 


092 


15 


129628 


U28727 


Hs.1174 


"Qycnn-depandent kinase InhtbHo- 2A (m 


085 


1.93 


130023 


X13461 


Hs^gooo 


CebnodulMlksS 


084 


1JQ 


130080 


X14850 


Hs.147097 


'H2A htstone famBy, member X" 


098 


1.96 


130385 


M126474 


HS.1SS223 


stsnrtiocatein 2 


1 


1 
130410 


V01514 


Hs.155421 


Alpha-fetoprotetn 


a63 


0.63 


130441 


U35635 


Hs,3013B7 


•Human DNA-f K mRNA. partial cdsT 


1.15 


3.65 


130482 


L32B68 


Hs.1578 


Bacuioviral lAP repeakontainlng 5 (sur 


1 


1.88 


130553 


AA430032 


Hs.252587 


Pituitary tumor-transforming 1 


a92 


1.96 


130577 


M35410 


Hs.162 


Insuniviiice growth factor binding prats 


1,17 


4.7 


130527 


L23808 


Hs,1695 


Matrix matalloprotsir^ase 12 (macrophaga 


aeo 


4w05 


130800 


AA223386 


Hs.19574 


ESTs,' Wealcly similar to Itatanin p80 subu 


1.13 


2.41 


130939 


AAS98688 


Hs.21400 


ESTs 


0.8 


0.89 


131046 


X02530 


Hs^48 


INTERFERON-GAMMA INDUCED PROTEIN PRECURS OJ 


1.15 


131244 


D38076 


Hs^4763 


RAN liinding protein 1 


1.13 


1.85 


131877 


J04088 


Hs.156346 


Topoisomerase (DMA) II alpha (170kO) 


1 


1 


131927 


AA461549 


HS.347B0 


'Doubleoortex; Ussencephaly, XJinked { 


0.81 


a62 


131965 


W90146 


Hs.35962 


ESTs 


0.74 


a27 


131978 


D80008 


Hs.36232 


KIAA01 66 gene product 


1 


1 


132354 


L05187 


Hs.211913 


Small proUne^ch protein 1A 


0.69 


1.43 


132543 


AM17152 


Hs^lOI 


ESTs; Highly similar to protein regulaU 


0.79 


4.27 


132632 


N59764 


Hs^98 


guanine^nophosphate synthetase 


1 


1.08 


132653 


U31201 


Hs.54451 


'laminin gamma2 chain gene (LAMC2), exon 


1 


1 


132659 


Z75190 


Hs.54481 


'Low density Pipoprotein receptor-relate 


0.89 


0.69 


132710 


W93726 


Hs^279 


"Serine (or (Tsteine) protonase hhMi 


0.64 


4.41 


132758 


W52432 


Hs^6105 


-ESTs. Weakly simaar to WDNM RAT W0NM1 


1.55 


2.06 


132767 


L05188 


Hs.231622 


Small proltne^ch protein 2B 


ass 


1.66 


132816 


M74542 


Hs.575 


Aldehyde dehydrogenase 3 


0.55 


0.55 


132990 


AA458761 


Hs.18387 


transcnp&on factor AP-2 alpha (acthrat 


1 


3.53 


133070 


U69611 


Hs.64311 


'A disintegrin and metalloproteinase dom 


1.16 


2 


133262 


U52960 


Hs.286145 


'SRB7 {suppressor of RNA polymerase B, y 


1 


2.7 


133317 


AA2152g9 


Hs.70830 


U6 snRNA-associated Sm-llke protein LSm7 


0.95 


1.42 


133370 


AA156897 


Hs.72157 


Homo sapiens mRNA; cDNA DKF2p564l1922 


1.12 


Z55 


133391 


X57579 


H5,727 


Rsapiens activin betaAsubunit (exon 2 


1.65 


1.76 


133832 


H03387 


Hs^41305 


estrogen-responsive 6 box protein (EBBP) 
'Serine (or cysteine) proteinase inhibit 


1.02 


1.39 


134032 


ZB1326 


Hs.78589 


1 


1 


134168 


AA398908 


Hs.181634 


'Homo sapiens cONA: FU23602 lis, done 


a95 


1.53 


134218 


AA227480 


Hs.80205 


Rm-2 oncogene 
""coHagan, fype XI, ai^ha 1*" 


1.36 


Z48 


134405 


R67275 


Hs.82772 


a78 


m 


134453 


X70683 


H8.83484 


SRY (sex determining regioo YH}qx 4 


1.89 


3.76 


134470 


X54942 


Hs.83758 


CDC28 protein kinase 2 


1.82 


4.11 


134645 


U87459 


Hs.167379 


"Cancer/testis antigen (NY-ESO-1, CTAG1. 


a82 


0.83 


134781 


M17183 


Hs^26 


Parathyroid hormofie-IBoB honnone 


1 


1 


135002 


U19147 


Hs^2484 


G antigen 6 


1 


1 


100040 


M97935 




AFFXcontrtiliSTATI 


0J2 


1.25 


101201 


L22524 


Hs.2256 


mdrix metalloproteinase 7 (matritysfai; 


2.92 


8.5 


101664 


M60752 


Hs.121017 


H2A histone family; member A 


1 


1 


102025 


U03911 


Hs.78g34 


mutS (E. coll) homobg 2 (col&n cancer; 


0.8 


1.61 


102031 


U04898 


Hs.2156 


RAR-relalsd orphan receptor A 


1 


1 


102221 


U24576 




UM domain on^ 4 


1 


1 


102270 


U30255 


H&75888 


phosphoghioonate dehydiogenase 


ixe 


1.43 


102339 


U37022 


Hs.95577 


cyd In-dependent Mnase 4 


OJB 


1.32 


102391 


U41668 


Hs.77494 


deoxyguanosiiie kinase 


1.07 


1.58 


103000 


XS1956 


Hs.146580 


enolase 2; (ganvna; neuronal) 


0.91 


1.49 


103395 


X94754 


Hs.119503 


methionine-tRNA synthetase 


0.89 


1.32 


105638 


AA281599 


Hs^0418 


Homo sapiens mRNA tor for histone H2B; c 


0.91 


1.25 


105726 


. AA292328 


Hs.g754 


acavaiing iranscnpaon taciv s 


0.94 


1.48 


114841 


AA234722 


H5.55408 


ESTs: Moderately simBar to CALCIU1IM)EPE 


0.78 


1.56 


115205 


AA262491 


Hs.186572 


ESTs 


1 


1 


115906 


AA436616 


Hs.82302 


ESTs 


0.74 


2.52 


119132 


R4g046 


Hs.107911 


ATP-fainding cassette; sub-family B (MDR/ 


1.1 


1.51 


124163 


H30539 


Hs.189838 


ESTs 


1 


1 


126487 


AA482505 


Hs.184601 


solute carrier family 7 (cationte amino 


1.01 


1.46 


127141 


AA3079B0 


Hs .75478 


KIAA0956 protein 


0.85 


1.4 


128034 


M905754 


HsJ5103 


tyrosine 3-monooxygenase/tryptophan 5-mo 


1 


1.18 


128609 


AA234365 


Hs.102456 


surviva! of motor neuron protein interac 


1 


1.5 


128895 


R37753 


HS.10698S 


ESTs 


1.7 


2 


130199 


Z48579 


Hs.172028 


a disintegrin and metalk^rotease domain 


1 


1 


130524 


U89995 


Hs.15g234 


forkheadboxEl 


1 


1 


133000 


U24152 


Hs.62402 


p21/Cdc42/Rac1-activated kinase 1 (yeast 


1 


1 


133858 


M25756 


Hs.75426 


secretogranin li (chromogranin C) 


1 


1 


135047 


AA460466 


Hs.g3597 


ESTs 


1 


1 


100053 


M27830 




AFFXoontiol: 28S libosomad RNA 


a88 


1.53 


100114 


000596 


Hs.82g82 


thymidylale synthetase 


168 


1.86 


100128 


D11094 


Hs.61153 


proteasome (prosome; macropain} 26S subu 


1.29 


Z03 


100154 


D14657 


Hs.81892 


K1AA0101 gene product 


. a7i 
Hs.165909 


ESTs 


0.92 


1.8 


117602 


N35020 


Hs.44685 


ESTs; Weakly siirflar to GOLIATH PROTEIN 


1.15 


184 


117950 


N51394 


H5.7M78 


KIAA0956 protein 


1.04 


236 


117992 


N52000 


Hs.172089 


Homo sapiens mRNA; cDNA DKFZ|)586B0222 (f 


0.62 


1.29 


118785 


N75386 


Hs.111867 


GU-Kruppel family member GU2 


1 


1 


119717 


W69134 


Hs.57987 


ESTb 


1 


1.4 


119814 


W74069 


Hs.56350 


ESTs 


a78 


1.77 


120128 


Z38499 


Hs^1448 


MKP-1 like protein tyrosine phosphatase 


a86 


1.46 


120242 


298443 


LL. oeoee 
nS4)O0DD 


ESTs 


a83 


201 
120463 


AA252994 


Hs.1578 


apopiosis inhibitor 4 (survivin) 


121054 


M396604 


HsJ7387 


ESTs 


121326 


AA404246 


Hs.97031 


ESTs; Vfeaidy similar to SimSar to phytD 


121378 


AA405699 


Hs.166232 


ESTs; Moderately sanlar to SODIUM- AND 


121457 


AM1 1448 


Hs^85 


ESTs 


1217B0 


AA422086 


Hs.124660 


ESTs 


1217B1 


AA4221S0 


Hs38370 


cytDchnsns PS40 fsn&jf msfrb^ predicted 


121844 


AA425732 


Hs.98485 


gapjundion pooteiii; beta 2; 26kD (conn 


122059 


M431737 


Hs.98749 


EST 


122338 


AA443311 


Hs.98998 


ESTs 


122354 


AA443772 


Hs.186692 


ESTs 


122591 


AA453265 


H&99311 


ESr8;\toak<y8imOartoMRJ [Ksapiens] 


122790 


AA4601S6 


HS.995S6 


ESTs 


123398 


AA521265 


H8.105514 


ESTs 


123518 


AA608531 


Hs.170313 


ESTs 


123673 


AA609471 


Hs.112712 


ESTs 


124000 


D57317 


Hs.74861 


acGioti^ RNA polyinsrsss li trsnscriptio 


124367 


N24006 


Hs.99348 


dIstsMess tomeo box 5 


124447 


N48000 


Hs.140g45 


Homo sar^s mRNA; cDMA DKFZpS86L141 ^ 


125756 


v\^gd 


Hs.81634 


ATP synthase; transporting; mitodrand 


125769 


A1382972 


Hs.82128 


5T4 oncofela) traphobiasl glycoprotein 


125852 


K09290 


H8.78550 


Homo SQpiens mRNA; cONA OK)=Zp564B1284 (f 


125924 


AA526849 


H&82109 


syndocsn 1 


126037 


MB5772 


Hs.6066 


KIAA1112 protein 


126214 


N29455 


Hs.74316 


desmoplakln (DPI; DRQ 


126414 


N7B770 


Hs.223439 


ESTs 


126737 


AA488132 


Hs^41 


ESTs 


128743 


M179253 


H8.172182 


po]j^)4}bidIng protein; cytoplasnric 1 


126926 


M179546 


H&832 


ESTs; H^hlysimitarlDlNTEGRINBETA^ 


127432 


AAS01734 


Hs.l 70311 


heterogeneous nuclear ribonudeoprolein 


128218 


H02682 


Hs.99189 


ESTs; Moderately similar to reoomUnaGo 


128527 


M31523 


Hs.101047 


IranscripfioiY liactof 3 (E2A invnunofliobid 


128568 


XB0573 


Hs.247568 


adenylate Idiiase 3 


126584 


M11433 


Hs.101850 


retinoMunding protein 1; celluter 


12B628 


CI 4037 


Hs.251978 


EST 


128691 


VV27939 


Hs.103834 


ESTs 


128714 


VDOSM 


Hs.17g661 


Homo saptens dons 24703 beta-tubuDn mR 


128733 


AA328993 


Hs.104558 


ESTs 


128781 


X85372 


Hs.105465 


small nuclear r9)onucteoprQtein polypepl 


129052 


AA498297 


H&182740 


ribosomal protein S1 1 


129095 


L12350 


Hs.108623 


thrombospondin 2 


129241 


M435665 


Hs. 109706 


ESTs; Moderately sMar to HN1 \pimuseu 


129665 


M88458 


Hs.1 18778 


KDEL (Ly&Asp^Leu) endopiasniic reSc 


129703 


AM01348 


Hs.179999 


ESTs 


129720 


AA476SB2 


Hs.12152 


ESTs; Moderately stnrular to SIGNAL REOOG 


129850 


N20593 


H$J6845 


GDP dissociation InhMor 2 


129896 


AA043021 


Hs.13225 


UDP-Gal:b6taGlcNAc beta 1;4- gatedosyR 


130069 


AA055896 


Hs.146428 


coUagen; ^e V; alpha 1 


130405 


H88359 


Hs.155396 


nudear factor (efythrckMiedved 2HB( 


130541 


X05608 


H&211564 


neurofilament; Gght (K)lypep6de (6Bkp) 


130599 


M9167Q 


Hs.174070 


ulxquiiin carrier proidn 


130867 


J04093 


HS.20S6 


UDP gtyoosyltransteiase 1 


131009 


AA063596 


Hs.22142 


ESTs; Vtteakly similar to NAOH-CnOCHROME 


131028 


U2024fl 


Hs.2227 


CCAAT/enhancert^ing protein (C/EBP); 


131083 


U66661 


Hs.22785 


ganma^nobutyric acid (6ABA) A recepto 


131091 


T35341 


Hs.22880 


ESTs; Highly siiriiiar to dipeptidyl pepD 


131144 


C14412 


Hs^3S26 


ESTs; Wly strntter to ttSPC038 protein 


131148 


C00038 


HS.23S79 


ESTs 


131164 


Y00503 


HS.18226S 


keratin 19 


131185 


M25753 


Hs^3960 


cycHn B1 


131219 


000476 


Hsi4395 


smaii toduclble cytokine subfamily B (Cy 


131454 


AA455896 


H&2699 


glyptoani 


131687 


LI 1068 


Hs.3068 


heat shock 70kO protein 9B (mortaIIn-2) 


131689 


AA599653 


Hs.30696 


transcription tector-Ilcs S (baste he&x 


131692 


D50914 


Hs.30736 


KIAA01 24 protein 


131786 


AA135554 


Hs.32125 


ESTs 


131843 


AA195893 ~ 


H8.1 84062 


ESTs; Moderately similar to pirtaGve Rsb 


131860 


U02082 


Hs.334 


Oncogene TIM 


131884 


H90124 


Hs.3463 


ribosomal protein S23 


131903 


AA481723 


Hs.3436 


deleted in oral cancer (mouse; homolog] 


131945 


M87339 


Hs.35120 


replication factor C (adhrator 1) 4 (37 


131958 


AA093998 


Hs.3566 


ESTs: Highly similar to phosptwryiaBon 


131954 


W4250B 


Hs.3593 


ESTs 


132001 


J00277 


Hs.37003 


v4{a-ras Harvey rat sarcoma viral oncoge 


132040 


M146843 


Hs.172894 


BH3 interacting domain death agonist 


132065 


D82226 


Hsi11594 


pioteasome (prosome; maoopain) 26S subu 


132109 


AA599801 


Hs.40098 


ESTs 


132112 


AA150661 


Hs.40154 


Jumonji (mouse) homolog 


132123 


M447123 


Hs^50705 


ESTs 


132162 


H8g551 


Hs.41241 


ESTs 


132180 


AA405569 


Hs.418 


fibroblast activation protein; alpha; se 


132309 


AA460917 


Hs^780 


lunDproto^moogeiw 


132371 


AA235448 


Hs.46677 


ESTs 


132618 


AA2S3330 


Hs^44 


adaptor-related protein complex 1; gamma 


132736 


U88019 


H&211578 


MAD (mothers against decapenteptogic; Dr 



a74 


1.64 


1.05 


1.93 


0.98 


1.3 


aoi 


1.83 


a9i 


1.59 


0.46 


0.55 


1.07 


1.54 


a94 


1.4 


1^ 


233 


1 


1 


asB 


1.39 




293 


asB 


1.3 


1 


1.93 


1 


1 


1 


1.15 


a74 


1.12 


a67 


1.1 


1.19 


1.7 


0.93 


1.59 


1.65 


6.76 


0.72 


226 


1J22 


225 


1.36 


1.63 


1.93 


3.55 


1.21 


1.66 


1 


1 


1.3 


216 


253 


28 


^Jsr 


212 


1.24 


209 


1.0B 


1.78 


1^ 


3.48 


0.87 


242 


1.22 


1.9 


1.1 


1,73 


0.82 


1,17 


1.34 


1.94 


a9 


134 


2ii9 


ai9 


1.04 


zst 


0J5 


1.61 


1.28 


263 


0.97 


1.63 


1.09 


1.79 


0.74 


1.68 


1.43 


4.19 


1.17 


1.98 


1.28 


1.79 


1 


1 


1.07 


1.66 


1 


4.6 


0.93 


1.05 


1 


1,23 


1.1 


1.8 


liB 


1.98 


1.43 


206 


0J8 


3^ 


1.19 


277 


0.86 


3.84 


0.66 


296 


0J99 


1.54 


1 


1.18 


1 


1.95 


lis 


239 


1 


1.33 


0.83 


1.63 


1X)8 


22 


1.23 


1.24 


0.91 


1.18 


1 


28 


0.87 


1.36 


1 


1.25 


1.12 


1.43 


1 


1.55 


0.89 


1^ 


1 


1.05 


0.99 


1.44 


1.06 


246 


1.08 


246 


1.02 


4.56 


1.16 


1^ 


as 


1.26 


OlS 


1.49 


1.21 


1J1 
132771 


AA488432 


H&56407 


DhnfinhdSfirina ohosDhfltdSfi 




U78525 


HSi57783 


iHikanffdr Iranftlflllrin InBiaKofl fectar 

fnlntCB pWy UPllilliillllll lIMIflUMil lOlnUt 






H&6086 


KIAA1119nrole]n 


1^959 




R1472 


PRTe* lAhstsMv eimilar In tinknouni ISjifiifiV 




AAOUOI 00 


M«7<iQd 


SOIUw bduici laliuiy A \laCIMsH60 


133005 


C21400 




KIAAOQTTl nrn\tan 


1JJUD9 


ADZOOD 


Hft lyoeofl 

nS, 1 f 4D9U 


oiacyiyiycciui riik»0| cuiJiia \ouai^/ 


lOOUOJ 


iMrUDOO 


Hs.6456 


rhanomnSn mnfnimnn TOPI* stihllflif 9 fb 


13^)86 


L17131 


ru. U90UU 


hlnh-trnhilihr nrrun fnflnfiiBlDna dtmnVKQ 
in^iriiiuuiiuj yiwup iiimiihbiuiki mhuiiiwbw 




T89703 




RKA Hn/i'mn mnfit otBisin fl 


133195 






to A A1 007 nmtfdn 


133313 


AA249427 


HS.707D4 


ESTs 


133331 


T62039 




rii)osoni3fl proton L>14 


133438 


013370 


H8.73722 


AP^ nucte^fi /iniiHifiiiicllDiial DMA reoai 


133445 




Hs 73797 


flufinind nucleoiidd Undkn DfOtdn (G or 


133483 


X52426 


HsJ4070 


keratin 13 




L40397 




tf fli tsii ismbrsns bsIGckinQ prokio 






Hs.74316 


HpemnolaVin /DPL* DPID 




YB9QA7 


H8.74471 


nan nincHnn nrntetfi* fllnhs 1* 4Akn /con 


IJdO'tV 


UlOlO 1 


H8.74619 


nmtoaRAmo ftwnsonip* macmnain) 9fiS suhu 




UUr rOO 


Hs 1 79*189 

no. 1 r mOs 


nirrlnar nhnenhnnmtftin lilmllar lo S Cflf 


lOOtur 


UUmOr 




giycyi-irvHn synuiciaso 


lOOO/ 1 




LU7C474 
ns.rOH/ 1 


vin/* finnnr nmtAtn 1 AR 


100009 


UOO/OZ 


Me 17R7ft1 

ns. 1 f Of 01 


ofiQ nmfo9enintt.iKsfy^o1pri fwHI hflnmlnn 
£00 prow»oniiwis8widiPU yoa 1 iiuiiMuy 


100000 


F09315 




Hiem* Iflmp /nnMonhllaft hoRuioa S 


looslo 


UUQ/74') 


He 77*M 


C9uniDnin 


100«lOO 




Us 4Q^Q3 

ns.io*K)90 


uanscnpuon ciungawn lacwi d \o\nj 


133982 


U47621 


His 907251 


nitripntflr aiilnflntiQnn fSSkDl fiimBar to 


134100 


L07540 




nmRrnfran fnrfnr ^ farfivahir 11 5 f3fi 




lUQU 


He 7Q1^(i 

ns.f 9 100 


1 IV.I nmlBin' petmnpn nvnilatftH 

UIV 1 piUUSMIt tSMIWyCII IDyUIOUBU 


134156 


U15174 


Hs.79428 


ROl 9/ffifpnavinK 19lcD4ft}araeljiia on 


134161 


U97188 


Hs.79440 




10*1 190 


rvwf U 


Hs.7980 


ESTs 


1040Df 


KOt 199 




nhnsnhnriVvvtvifllvcinflRudA tonrwliransfBr 


1044UZ 


UZ91D9 


Hs. 82712 


franHo y mantal mfarrbtt^n* fllltncnfTtsI 
llo^uo A iiioiiidi loiaiiioumi; ounMUuioi 


41il/!fr7 

lO*l*K>f 


UODsDO 


Hs.174044 


riiehmfaltpH ^ ninmnlnorui!* to nmsfiohila 


104403 


Y47SR7 
All OOf 


He jnTn 
n5.ooioo 


email niiplQar rihnnimlnnnmlldn nnlwWMll 


49>l>IQn 
104490 


MDOlOU 


ns.04 101 


fhranmiLtRMA cvnIhphu&A 


4<9Acni 
1049Ui 


WOWfU 


He 911<i(iR 


ntikarvnitf* fmnslalinn iirifisfion fador 


l449Uf 


M004O0 


nS.D4010 


repiicouon pioiBinni ^fujujj 


1040*10 


n't 1010 


He R591() 


DpIotAH in eniH.h9nHfei£t.firYit 1 reoto 


104099 




ns.oO£9f 


rancoiii BnoiTiisii conipioiwiiuQun grou^ m 


134692 


R73567 


Hs.8850 


a disintegrin and metalloproteinase doma


134693 


N70361 


Hs.8854


ESTs 


134806 


Z49099 


Hs.89718


spermine synthase


134821 


Z34974 


Hs.198382 


plakophilin 1 (ectodermal dysplasia/


134864 


Y08999 


Hs.90370 


actin related protein 2/3 complex; subun


134914 


U29615 


Hs.91093


chitinase 1 (chitotriosidase)


134953 


L10678


Hs.91747


profilin 2


134993 


AA282343 


Hs.g242 


purine-rich element binding protein B


135051


C15324 


Hs.93668 


ESTs 


135158 


U51711 




Human desmocollin-2 mRNA; 3' UTR
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Ql91 


1.43 


1.16 


153 


1.02 


188 


Ol72 


297 


0.88 


1.34 


0.93 


123 


1.14 


1.76 


1X97 


1.43 


1.1 


18 


£29 


Z69 


1.07 


1.68 


0.85 


118 


091 


145 


0.94 


1.68 


085 


114 


1.1 


169 


0.7 


&21 


0.95 


1.3 


0.91 


125 


0.64 


129 


1.09 


199 


102 


1.5 


ill 


3.33 


184 


6.7 


1.15 


186 


1.3 


191 


1.3 


199 


0.72 


165 


1.04 


162 




155 


0.82 


195 


0.96 


148 


y 


za 


126 


2 


\ 


147 


0.94 


1.57 


1.2 


2.64 


084 


136 


17 


2.93 


146 


273 


1.36 


222 


in 


1.64 


109 


1.82 


0.98 


. 1.35 


a99 


1.4 


095 


142 


1.16 


1.29 


0.65 


176 


0.96 


173 


1.35 


211 


0.86 


116 



Table 1B shows the accession numbers for those pkeys in Table 1A lacking UniGene IDs. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
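
In the rows that follow, many accession numbers run together without separators. The following is a minimal sketch of how such a row might be split back into its three columns. It assumes that accessions follow the classic GenBank nucleotide shapes (one letter plus five digits, or two letters plus six digits) or a RefSeq-style two-letter prefix with an underscore, and that the CAT number is a single whitespace-free token. The name `parse_row` and the sample row are illustrative only, and identifiers damaged in transcription will simply not match.

```python
import re

# Assumed accession shapes (not stated in the text):
#   RefSeq-style   NM_001067   two letters, underscore, digits
#   GenBank 2+6    BE623001    two letters, six digits
#   GenBank 1+5    L05095      one letter, five digits
ACCESSION = re.compile(r"[A-Z]{2}_\d{6,9}|[A-Z]{2}\d{6}|[A-Z]\d{5}")


def parse_row(line):
    """Split one 'Pkey CAT Accessions' row into (pkey, cat, accessions)."""
    pkey, cat, rest = line.split(None, 2)
    # findall scans left to right, so accessions that are fused together
    # without whitespace are still recovered one by one.
    return pkey, cat, ACCESSION.findall(rest)


# Hypothetical, cleanly transcribed row used only for illustration.
row = "100661 23182_1 BE623001 L05095AA383604AW966416N53295"
pkey, cat, accessions = parse_row(row)
print(pkey, cat, accessions)
```

Tokens that do not fit one of the assumed shapes are dropped silently, so a production parser would want to report the unmatched remainder rather than discard it.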

Pkey CAT Accessions 

100661 23182_1 BE623001 L05095 AA383604 AW966416 N53295 AA460213 AW571519 AA&03655

100667 26401 J li)5424 X56794 S66400 X55150 W60071 AW351820 X55938 M83326 BE005289 BE070059 M83324 BEO0S248 BE069717 BE181648 BE069700 

AyV608203B£069721AW382138AW803776BB«63954BE0053346E00^4T27386AAS32714AA972696AW37^ 
A1783934 AW377727 BE16371 5 AL047291 AA279047 AA523003 BE008048 BE440141 W23614 BE090519 BE092193 N29181 M20358 N44153 
" BE546944 T69231 AW377441 AA907406 H50799 AW051416 AI420712 BE620922 A1279161 AA932549 W47198 BE005241 AI342696 H5O7O0 
AI969974 AIB63855 AA3744gO AW130675AI9S0633 AA146687 H99482 X55150 BE005414 BE005339 N28294 A1673068 Ai887890 AWB04171 
AI675961 AW804172AA778841 AL048a50Alt27757A]095568 AW204g65AW468978 W31698AI05259SAI27B771 BE464018 Ai061503 AI824ig6 
AA513211AA411052AVI/084376 N48752AA703209 N35560AVV05991BAAD54563AI280942T27619B^1^ 
AA283090 AAg62536 H82726 W521 15 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

100668 »401J 105424X56794866400X55150 W60071 AW351 820 X55938 M83326 BE0052B9 BEO70O59 M83324 BE005248 BE069717 BE181648BE069700 

AW606203BE069721AVV382138AW803776BE463954B£OQ5334BE005274T27386AA932714AA972695AW377728AI63250^ 
AI783934 AW377727 BE163715 AL047291 AA279047 AA523003 BG006048 BE440141 W23614 BE090519 BE092193 K29181 N20358 N441S3 
BE546944T69231 AW377441 AA907406 H50799 AW051416AI42O712BE620922AI279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 A1950633 AA146687 H99482 X55150 BE006414 BE005339 N28294 AIB730B8 AI887890 AW804171 
AI675961 AW804172AA77B841 AL048050AI127757AI095568 AW204965AW468978 W31898Aia52595AI278771 BE464018 AI081503AI824196 
AA513211AA411062AW084376 N48752AA703209N35580AW059918AA054563AI280942 T27619BE621435 N6 
AA283090AA962536H82726 W52115W45432W60433AA577548AA146714BE150994AA054615AVm60^^ 
AA054555 

101332 25130_1 J04088 NM_001067 AF071747 AJ011741 N85424 AL042407 AA218572 BE296748 BE083981 AL040^
6E081283AA670403AV\€04327Bai94229AA104024AI47l482A1970337AA737616AI827444 
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A)948838AW235336AW172827 M095209 8EM6383Al734240VV16699A1660329AI28g433M933m 
AA62S873 W7e031 BE2D63(^AA5S08(eAI743147A199007SAA948274AA12g533AI635399AA605313A^^^^ 
AA307706 BE550282 AI760467 AI630636 AI221521 AW674314 AWD78889 Aig33732 AI686969 A)166928 AWD74595 All 27486 AL079644 
AI910815H17B14AA310903AW137854T19279AA026682AA306035AW383390AVV383389AVV3B34 

AA306247 AA352501 AW4C3639 F0S421 AA224473 AA305321 H93904 AA089612 AW391543AW402915AW1733B2AW402701 AW403113 
R94438 N73128 H934e6 AA090928 AA095051 T29025 AW951071 L47277 L47276 AI375913 BE3841 56 W246S2 AA746288 AA568223 BE090591 
H93033 N57027AA504348AA327653AVV96891dN53767AA843715AI4S3437AVV263710AI^^^ 

AI803319ALJ042776AWD74313AI887722AI032284AA447521 A1123885 N29334AI354911 AW0g0687AA238763AA435535AA236910 
AA047124 AA236734AW514610 H93467AA962007AI446783AA127259AI613495 AI686720AI587374 AA936731 AA702453 AI859757 
AA21 B786 A1251B19 A1469227 AABC6022 A!(m24 1^1868 AA968782 AA23691 9 AA809450 AA227220 AA765284 A!19^^ 
AA80S794AA729280AA806238AW768817N71879A1050686AA50S822AA668974AI688160BE045915AW46 
AA834316 AV\ffi91901 AWD63876 AW294770 AI300266 At338094 AIS60380 AA721755 H09978 020305 D2915S AWB217gO BE150864 F01 675 
A1457474AW466316AA5509G9AA630788 
100780 458 127 BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 BE269598 BE559865 

BE396881 BE560031 BE514199 BE560037 BE560454
100B30 4002 1 AC004770 W05005AA356068AA094281 H29358 T56781 AWB75313 L37374 BE312466BE311755BE207106 BE293320 BE018115AW239090 
BE548830 AW247547 AA7760S2BE397382AA4B6713T10111 T09340AW4g89&1 B^7280AA356003 AV\^1520AW875331 AA580720 
AW875336 BE276873 BE408229 AW188148 BE255166 6E253761 AW793727 AW373141 AW581548 AA471223 AA30595O 6E263976 AA626B20 
BE267409 AW360962 AA090655 C00312 BE312741 BE407213 AA209352 AW298199 AW248553 AW297794 AW731722 BE3a0586 AW731972 
AW615446 BE301599 AW615520 AA486714 AW440257 AA196516 AA564630 AA61 8079 AW192592 AW474985 AA604580 AI627461 AA765440 
Tables 2A-8C were previously filed on November 9, 2001 in USSN 60/339,245 (18S01-004100US)

Table 2A shows 504 genes down-regulated in lung tumors relative to normal lung and chronically diseased lung. Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 90th percentile of AI for normal lung samples divided by the 80th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples.
R2: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples.
R3: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples, divided by the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
R4: average of AI for normal lung samples divided by average AI for squamous cell carcinoma and adenocarcinoma lung tumors.
R5: median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinomas.
R6: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples, divided by the 90th percentile of AI for adenocarcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
R7: average of AI for normal lung samples divided by the 90th percentile of AI for squamous cell carcinomas.
R8: median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples, divided by the 90th percentile of AI for squamous cell carcinomas minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
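
The ratio definitions above combine a few order statistics of AI: a median or average for normal lung, an upper percentile for tumors, and, in the background-corrected forms, the 15th percentile over all samples subtracted from both numerator and denominator. The sketch below illustrates three of these forms for a single probeset. The AI values are made up, the linear-interpolation percentile is an assumed convention (the text does not say which percentile method was used), and the function names are hypothetical.

```python
from statistics import mean, median


def percentile(values, p):
    """Return the p-th percentile (0-100) by linear interpolation between ranks."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def median_over_p90(normal, tumor):
    """Median AI of normal lung divided by the 90th percentile of tumor AI."""
    return median(normal) / percentile(tumor, 90)


def background_subtracted_ratio(normal, tumor, all_samples):
    """Same ratio after subtracting the 15th percentile of AI over all samples.

    No guard is included for a zero or negative denominator, which can occur
    when the tumor percentile does not exceed the background estimate.
    """
    background = percentile(all_samples, 15)
    return (median(normal) - background) / (percentile(tumor, 90) - background)


def average_ratio(normal, tumor):
    """Average AI of normal lung divided by average AI of tumors."""
    return mean(normal) / mean(tumor)


# Made-up AI values for one probeset, for illustration only.
normal = [100, 200, 300, 400, 500]
diseased = [150, 250]
tumor = [10, 20, 30, 40, 50]  # adenocarcinoma and squamous cell carcinoma pooled
all_samples = normal + diseased + tumor

print(round(median_over_p90(normal, tumor), 2))
print(round(background_subtracted_ratio(normal, tumor, all_samples), 2))
print(round(average_ratio(normal, tumor), 2))
```

Using an upper percentile of the tumor group in the denominator, rather than its mean, makes the ratio large only when nearly all tumor samples sit below the typical normal level.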
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ocuolBu HJt/.lctMololBg piuicui 1 


133535 


Ai ii^nin 


Ue 9R41Rn 


nrntivarihoriri 9 ff^ariHnrinJilm 9\ 

UtUU^UOUIlQIMl €»■ IvOUIIwl II ^ 


33537 


U41518 


Hs.74602 


aquaporin 1 (channel-forming integral pr


33856 


eE149455 


Hs.75415 


Accession not listed in Genbank


33689 


NM_001872


Hs.75572


carboxypeptidase B2 (plasma)


133779 


T58466 


Hs.222566 


ESTs 


133978


AF035718 


Hs.78051 


transcription factor 21


133985


L34657


Hs.78146


platelet/endothelial cell adhesion molec


34000 


AW175787 


Hs.334841 


selenium binding protein 1 


34111 


AI3725e0 


Hs^022 


TU3A protein 


134185 


AA285136 


HS^1914 


Homo sapiens mRNA; cDNA DKFZp586K1220 (f


13^ 


AI873257 


Hs.7994


ESTs; Weakly similar to CGI-69 protein [
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134641 AIG92634 Hs.156114 

134677 AA2S1363 Hs.177711 

134745 Nli^OOOSSS Hs.89472 

134749 T28499 Hs.8g48S 

134786 T29618 

134825 U33749 

134978 AI829008 

135010 N50465 

135053 AW796190 Hs.9367B 

135081 AF069517 Hs.173993 

135091 AA493650 

135135 AA775910 

13S203 C15737 

135236 A)636208 

135266 R41179 

135346 NM-000926 HsJ92 

135378 AWg61816 H&24379 

135387 NR/LQ01972 H&99883 

135388 W27965 H&g9865 
135402 L1239B H&99g22 



HS.92S27 



H8.94367 

Hs.g5011 

Hs.269386 

HS.96S01 

Hs.g7393 



protein tyrosine phosphatase; non-recept 
ESTs 

angiotensin receptor 1B

carbonic anhydrase IV

angiopoietin 1 receptor; TEK tyrosine ki

thyroid transcription factor 1

ficolin (collagen/fibrinogen domain-cont

ESTs 

ESTs 

RNA binding motif protein 6
ESTs

syntrophin; beta 1 (dystrophin-associate

ESTs

ESTs

Human mRNA for KIAA0328 gene; partial cd
phospholipase A2; group IB (pancreas)
potassium voltage-gated channel; shaker-
elastase 2; neutrophil
EST 

dopamine receptor D4 
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2BJ0 
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37.20 
Table 2B shows the accession numbers for those pkeys lacking UniGene IDs for Table 2A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accessions 

108447 43452_-7 AA079126

108550 120073_1 AA084657 AA084996

108655 127522_1 AA099980 AA113013

102397 44371_-1 041898

126303 1525933_1 D76841 1378880

125810 1554054_1 H00083 R81082

103627 2615_1 Z48513 Z48512

121366 280401_1 AI743515 AA405617 AW276706

114609 116777_1 AA079505 AA079537

115272 172113_1 AW015947 AA211890 AA279425

108338 112186_1 AA070773 AA070774

108434 114012_1 AA078899 AA078782 AA075788

123802 genbank_AA620448 AA620448

102310 NOT_FOUND_entrez_U33839

102636 entrez_U67092 U67092

104776 genbank_AA026349

120504 genbank_AA256837 AA256837

113502 genbank_T89130 T89130

108499 genbank_AA083103 AA083103

101308 entrez_L41390 L41390

108629 genbank_AA102425 AA102425

103098 221.215 M86361 Z26593 X028S0 D13070 AE0008S9 M17649 M87869 Ma7871 X61077 M16286 AR)18169 X61079 859351 }(60142 AF043169 

103241 entrez_X76223 X76223

103508 entrez_Y10141 Y10141

10^5 entrez_Z26256 Z26256

119514 NOT_FOUND_entrez_W37937 W37937

121082 genbank_AA398722 AA398722

128634 AA464918_at AA464918

105817 genbank_AA397625 AA397825

121518 genbank_AA412155 AA412155

114449 genbank_AA020736 AA020736

114848 genbank_AA101056 AA101056

121950 genbank_AA429515 AA429515

107723 oenbantLM01S9S7 AAD15967 
Table 3A shows 452 genes up-regulated in chronically diseased lung relative to normal lung. Chronically diseased lung samples represent chronic non-malignant lung diseases
such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each
probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples.

R2: 80th percentile of AI for chronically diseased lung samples divided by the [...] adenocarcinomas

R3: 70th percentile of AI for chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by the 90th percentile of normal lung samples, squamous cell carcinomas [...] chronically diseased lung and tumor samples



Pkey 


ExAccn


UnigeneID


Unigene Title


R1 


135423 


U50531 


Hs.135751


Human BRCA2 region, mRNA sequence CG030


12.40


135378 


AWg61618 


Hs^4379 


MUM2 protein 




135346 


NM.000928 


Hs.992 


phospholipase A2, group IB (pancreas)




13S235 


AW298244 


Ks.293507 


ESTs 


12.40 


135057 


U9Q266 


Hs.93810 


cerebral cavernous malformations 1


11.67 


134951 


BE305081


Hs.169358


hypothetical protein




134799 


M36821


Hs.89690


GRO3 oncogene




134786 


T29618


Hs.89640


TEK tyrosine kinase, endothelial (venous




134772 


NM_000829


Hs.16^97


glutamate receptor, ionotropic; AMPA 4


29.80 


134752 


BE246762


Hs.89499


arachidonate 5-lipoxygenase




134749 


T28499 


Hs.89485 


carbonic anhydrase IV




134696 


BE326276 


Hs.8861 


ESTs 




134636 


NM_005582


Hs.87205


lymphocyte antigen 64 (mouse) homolog, r


13.60 


134627 


AI018768 


Hs.12482 


glyceronephosphate O-acyltransferase




134622 


AW975159 


Hs.293057


ESTs, Weakly similar to A55380 faciogeni




134570 


U66615


Hs.172280


SWI/SNF related, matrix associated, acti


13.20 


134561 


U76421 


Hs.85302 


adenosine deaminase, RNA-specific, B1 (h




134468 


NM_001772


Hs.83731


CD33 antigen (gp67)




134417 


NM_006416


Hs.82921


solute carrier family 35 (CMP-sialic aci




134343 


D50683


Hs.82028


transforming growth factor, beta recepto




134323 


BE170651 


Hs.8700 


deleted in liver cancer 1




134300 


NM_001430


Hs.8136


endothelial PAS domain protein 1




134299 


AWS60939 


Hs.97199


complement component C1q receptor




134253 




Hs.80738


sialophorin (gpL115, leukosialin, CD43)


20.60 


134182 


P52059 


Hs.7972


KIAA0871 protein 


12.20 


133985 


L34657 


Hs.78146 


platelet/endothelial cell adhesion molec




133978 


AF035718 


Hs.78061


transcription factor 21




133835 


A1677897 


Hs.76640 


RGC32 protein 




133651 


A1301740 


Hs.173361 


dihydropyrimidinase-like 2




133S33 


D21262 


Hs.75337 


nucleolar and coiled-body phosphoprotein


15.20 


133565 


AW955776 


Hs.313500 


ESTs, Moderately similar to ALU7_HUMAN A




133548 


AW946364 


Hs.178112 


DNA segment, single copy probe LNS-CAI/L




133488 


AA335295 


Hs.74120 


adipose specific 2




133478


X83703


KsJ1432


cardiac ankyrin repeat protein




133337 


AF085983 


Hs.293876 


ESTs 




133200 


AB037715 


Hs.183639 


hypothetical protein FLJ10210




133153 


AF070592 


Hs.66170 


HSKM-B protein


30.60 


133130 


AI128606


Hs.6557


zinc finger protein 161


22.60 


133120 




Hs.65424 


tBtrsnGChn (^ssmlROQBiviifK&iQ protdn 








Hs.169449 




13.80 


132836 


AB023177 


Hs.29900


KIAA0960 protein




132799 


W73311 


Hs.169407


SAC2 (suppressor of actin mutations 2,


41.60 


132742 


AA025480 


Hs.292812 


ESTs. Weakly similar to T33468 hypotheti 


40.40 


132548 


X12830 


Hs.193400 


interleukin 6 receptor




132476 


AL119844 


Hs.49476 


Homo sapiens clone TUA8 Cri-du-chat regi




132439 


AK001942


Hs.4863


hypothetical protein DKFZp566A1524




132240 


AB018324 


Hs.42676 


KIAA0781 protein 


21.20 


132210 


NM_007203


Hs.42322 


A kinase (PRKA) anchor protein 2 




132199 


AL041289 


Hs.168084


ESTs


15.20 


131751 


T96555 


Hs^1562 


ESTs 




131745 


Ai8285S9 


Hs^1447 


ESTs. Moderately simiar to A46010 X^l 


27.80 


131694 


NM_000246


Hs^76


MHC class II transactivator




131686 


NM_012296


Hs.30687


GRB2-associated binding protein 2




131676 


AI126821 


Hs^14 


ESTs 




131629 


Z45794 


Ks^09 


ESTs 


21.40 


131589 


C18825 


Hs.29191


epithelial membrane protein 2 




131536 


AA0192O1 


Hs.26g210 


ESTs 




131517 


AB037769 


Hs.263395


sema domain, transmembrane domain (TM),




131355 


R52804 


H&25S56 


DKFZP564D206 protein 




131253 


R71802 


HS24853 


ESTs 


ISjOO 


131207 


AF104266 


Hs.24212 


iSTOptwrn 




131156 


AI472209 


Hs.323117 


ESTs 




131066 


AW169287 


Hs.22588 


ESTs 




131061 


N64328 


H&268744 


KIAA1798 protein 




131053 


AA348541 


Hs.296261 


guanine nucleotide binding protein (G pr




130895 


AA641767 


H&21015 


hypothetical protein DKFZp564LJ0864 simil


16.60 


130762 


064371 


HS.189B 


paraoxonase 1 


12.00 



IS 



aoo 

&20 



R3 
2.13 



1.93 
2.07 



1.92 
1.92 

1.78 



&20 



a60 



1.77 
2.08 
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7.20 
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6l20 



ft40 
3.59 
4.48 



3.54 



1.88 
1.99 
1.76 



1.75 
1.84 



1.93 
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130657 AW337575 K&201591 

130655 AI831962 
AL1 10226 

30562 D50402 

130555 R69743 

I3036S W56119 

130273 AWg72422 

302S9 NM.000328 

30090 Hg7878 

29958 R27496 

129898 A1672731 

129675 M161018 

129699 AB007899 

29626 F13272 

29598 N30436 
AI33824 7 
X77777 

129527 AA76S221 

129402 W72062 

129385 AA172106 

129315 NM.014563 

29312 T97579 

29240 AA361258 

23210 AL039940 

29122 AW958473 



29057 

28946 Y13153 

28798 AF01SS25 

2BrB9 AW368576 

28778 AAS04776 

128766 AW160432 

28631 R44238 

28624 BE154765 

09 NM.003616 

28603 NII/L004915 

98 AA305407 

28458 H55864 

128061 AF150882 

27968 AA830201 

27959 A1302471 

127944 A1557081 

127925 AAfiOSISI 

27896 AI669586 

27659 AA7618Q2 

27817 AA836641 

127742 AW293496 

127628 AI2401Q2 

27609 X80031 

127582 AA908954 

27543 AKD00787 

27535 AA568424 

127404 AI379920 

127396 L31988 

127374 AA442797 

127346 AA203616 

127340 BE047653 

27307 AW952712 

27242 AW390395 

27167 AA62S6gO 

27046 AA321948 

26928 AA480902 

26900 AF137386 

26852 AA399981 

26816 AA248234 

126812 AB037860 



26545 AA316181 

26592 A16111S3 

26556 AF255303 

!26433 AA325606 

26299 AW9791S5 

26218 AUMgSOl 

26182 AA721331 

26177 AW752782 

126142 H86261 

26077 M78772 

25994 Ai99052g 

25934 AA193325 

25847 AW161885 

125831 H04043 

125731 R61771 

125676 BE61^18 

125561 F1B572 

125552 H09701 

125489 H4gi93 



ESTs 

cysteine-rich protein 1 (intestinal)
DKFZP434H204 protein
solute carrier family 11 (proton-coupled
integrin, alpha 1

eukaryotic translation initiation factor
MAD (mothers against decapentaplegic, Dr
retinitis pigmentosa GTPase regulator
zinc finger protein 36 (KOX 18)
annexin A3
ESTs 

hypothetical protein FLJ13920
homolog of yeast ubiquitin-protein ligas
ferritin, light polypeptide
Homo sapiens cDNA FLJ12566 fis, clone NT
Homo sapiens mRNA; cDNA DKFZp586L0120 (f
vasoactive intestinal peptide receptor 1
d8lld4ul)iiiln 
ESTs 

Rag C protein 

spondyloepiphyseal dysplasia, late
ESTs, Weakly similar to I78885 serine/th
interleukin 7 receptor
KIAA1102 protein 

nudix (nucleoside diphosphate linked moi
CDW52 antigen (CAMPATH-1 antigen)
kynurenine 3-monooxygenase (kynurenine 3
chemokine (C-C motif) receptor-like 2
caveolin 2

ESTs, Weakly similar to I38022 hypothet
craniofacial development protein 1
KIAA1080 protein; Golgi-associated, gamm
ESTs, Weakly similar to TRHY_HUMAN TRICH
survival of motor neuron protein interac
ATP-binding cassette, sub-family G (WHIT
potassium inwardly-rectifying channel, s
ESTs 

sodium channel, voltage-gated, type XII,
ESTs

Homo sapiens cDNA FLJ23123 fis, clone L
S-adenosylmethionine decarboxylase 1
mitogen-activated protein kinase kinase
ESTs
ESTs
ESTs
ESTs

NDRG family, member 4
collagen, type IV, alpha 3 (Goodpasture 
ESTs 

Homo sapiens cDNA FLI2(r780 fis, clone CO
ESTs
ESTs

DKFZP564A122 protein
ESTs, Weakly similar to I38022 hypothet
DnaJ (Hsp40) homolog, subfamily B, membe
ESTs, Weakly similar to ZN91_HUMAN ZINC
ESTs, Weakly similar to AF191020 1 E2IG5
cathepsin S 
ESTs 
ESTs 
ESTs 
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Hs.17409 
H5.16441 
Hs.182611 
Hs.116774 
Hs.155103 
Hs.153863 
Hs.153614 
Hs.132390 
Hs.1378 
Hs.13256 
HS.130K 
Hs.12017 
Hs.111334 
Hs.11556 
Hs.98314 
Hs.198726 
HS.270B47 
Hs.11112 
Hs.1 10950 
Hs.174038 
Hs.110334 
Hs^37868 
Hs.202949 
Hs.30ig57 
HS276770 
HS.107318 
HS.3Q2043 
Hs.139851 
Hs.186709 
Hs.298460 
HS.15S546 
Hs.102647 
HS.1024S6 
Hs.10237 
Hs.102308 
HS.S6340 
Hs.186877 
Hs.124347 
Hs.124292 
Hs.262476 
Ks,3628 
Hs^194 
Hs^1559 
Hs.163085 
Hs.teQ138 
Hs.322430 
Hs^ 
Hs.130844 
Hs.157392 
Hs.164450 
H8^Q224 
Hs.187991 
Hs.312110 
Hs.44896 
Hs.119183 
Hs.125712 
Hs.161301 
Hs.ig0272 
Hs.293968 
Hs.137401 
Hs.12701 

gb:zu88c01 fi Soare8.testts_NHr Homo sap 
gbxsg2228^.F Human fetal heart. Lamb 

Hs.173933 nuclear factor I/A

Hs.151999 ESTs

Hs.61635 six transmembrane epithelial antigen of
Hs.6093 Homo sapiens cDNA: FLJ22783 fis, clone K
Hs.112227 membrane-associated nucleic acid binding

gb:EST28707 Cerebellum II Homo sapiens c
Hs.298275 amino acid transporter 2
Hs.13649 Novel human gene mapping to chomosome 13
Hs.293771 ESTs

Hs.129750 hypothetical protein FLJ10546
Hs.40568 ESTs
Hs.210836 ESTs
Hs.270799 ESTs

Hs.32646 hypothetical protein FLJ21901
Hs.249034 ESTs

gb9j45c03x1 Soares placenta Nb2HP Homo
Hs.26912 ESTs

Hs.151973 hypothetical protein FLJ23511
Hs^978 ESTs, Weakly similar to ALU4_HUMAN ALU S
Hs^78366 ESTs, Weakly similar to I38022 hypotheti
Hs.124884 ESTs, Moderately similar to ALU7_HUMAN A



11.60 
21.20 
16.60 
22.63 

39^ 

15.20 
1Z40 
20.83 



12.20 
26.40 



16J)0 
12.80 



17.20 
21.30 

10.60 
13.40 

14.00 
14.00 
11.00 
11.10 

19.60 
15.40 
17.50 
14.60 
15.40 
14.60 
21.00 
15.80 

22.60 
21.40 
41.20 
11.00 



1Z20 
17.19 
13J7 
1&40 

laoo 

16,77 
14.60 

13.40 
18.20 
14.00 
16.59 
17.40 
13.00 
49.57 

13.20 
11.20 

.12.60 
33>I0 



9.60 
6.60 

5.05 



2.08 
1.91 



1.91 



4.20 
&20 



2JSZ 
Z11 

1.95 

^24 



1.78 
2.51 



4.00 



7.00 



5.60 



4.67 



3.50 



1,78 
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125167 AL137540 Hs.102541 

125139 AW194933 

125042 776906 

124711 NM.004657 

124631 NM.014053 Hs^94 

124578 tm2\ Hs.231500 

124574 AL036596 

124472 N52517 

124438 BE178536 

124357 N22401 

124306 AW973078 

124214 H58608 

124097 AW298235 

123976 T89832 

123972 T46848 

123961 AL050184 

123936 NM.004673 H8^1619 

123802 AA620448 

123734 AA609861 

123619 AA602964 

123596 AA421130 

123476 AA3B4564 

123340 AA504264 

123190 AA4B9212 

123136 AW4519g9 

123073 AA485061 

123055 AA482005 

122699 AA456130 

122679 AA811286 

122633 NAL001546 H8.34853 

122553 AA451884 Hs,190121 

122544 AW973253 

122485 AAS24547 

122211 AA30090O 

122127 AW207175 

122011 AA431082 

121992 A1860775 
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Hs^9432 
Hsi6S30 



Ks.42322 
Hs.102670 
H8.11Q80 

H8^303g 

Hs.151323 

Hs.101689 

Hs.170278 

H8.7D337 

Hs^eiO 



Hs.312447 

Hs.1 12640 
H5.108829 
Hs.182937 
HS.10S22B 
Hs.194024 
Hs.105652 
Hs.105102 
H5.301721 
H8.192837 



Hs.292689 
HS.16031B 
H8.g8849 
Hs.106771 



121351 

121314 W07343 

121242 AA400857 

121059 AA393283 

120934 AA226198 

120755 AA312934 

120637 AA811804 

120484 AA2S3170 

120336 N85765 

120266 AI807284 

120132 W57554 

120041 AA830Ba2 

119996 W88996 

119970 AA767718 

119861 W78816 

119824 W74536 

119740 AW021407 

119271 AI061116 

119221 C14322 

119126 R4517S 

119073 BE245360 

118928 AA312799 

118901 AW292577 

118661 AL137554 

118607 AI377444 

118449 AI813865 

118418 N66028 

118379 N64491 

118329 N63520 

118320 N63451 

116253 AA497044 

118124 N56968 

118056 AB037748 

118032 N52802 

117840 T26379 

117404 N39726 

117314 N32498 



Hs.g8506 
Hs.193784 
Hs.300670 
Hs.178096 
H8.110286 
Hs.193767 
Hs.98175 
H8.126065 
H5.97901 
Hs.287727 
H5.182536 
HS.97S09 



Hs.190745 

Hs.96473 

HS.1B1165 

Hs.205442 

Hs.125019 

K5.59368 

Hs.59134 

HS33S81 

Hs.4g943 

Hs.184 

Hs.21068 

H$.65328 

H8.250700 

HS.1171B3 

Hs.279477 

H5.283689 

Hs.94445 

Hs.49927 

HS.S4245 

Hs.164478 

Hs.49t05 

Hs.48990 

H^.141600 

Hs.20887 

Hs,46707 

Hs.42768 

Hs.47544 

Hs.48802 

Hs.15220 

Hs.42829 



33.60 
10.93 

11.20 
14^ 

313 



14.40 



15.40 



ESTs 

ESTs 38.00 
hypothetical protein FLJ13456 18.20
netrin 4

hypothetical protein MGC10924 similar to
ESTs, Moderately similar to ALU1_HUMAN 21i0
serum deprivation response (phosphatidyl
FLVCR protein 233
EST 21^
A kinase (PRKA) anchor protein 2
EST 373
membrane-spanning 4-domains, subfamily A
gb:yw37g07.s1 Morton Fetal Cochlea Homo 14.64
ESTs 
ESTs 
ESTs 
ESTs 

immunoglobulin superfamily, member 4
DKFZP434B203 protein
angiopoietin-like 1

gb:ae58o09.s1 Stratagene lung carcinoma
ESTs 

gtj:nti97c02.s1 NCLCGAP_Pi2 Homo sapiens 
EST 
ESTs 

peptidylprolyl isomerase A (cyclophilin
EST
ESTs
ESTs

ESTs, Weakly similar to reverse transcri
KIAA1255 protein

ESTs. WeaMysMlartoALJUS^MAN ALUS
inhibitor of DNA binding 4, dominant neg
ESTs 
ESTs 

FXYD domain-containing ion transport reg
ESTs, IModeiatBlysimaartoAFieiSI1 1 H
ESTs

gb:zw78a10.s1 Soares_testis_NHT Homo sap
ESTs

Homo sapiens mRNA; cDNA DKFZp586K1922 (f
KIAA1204 protein

angiotensin I converting enzyme (peptidy 12.43 

ESTs 

ESTs 

EST 14.00 
ESTs 

EST 113 
hypothetical protein FU23132 123 
phospholipid scramblase 4

ESTs 22.40 

gb2t74e03Jl Soares.tBStis.NHT Homo sap 143 

gb3tG26a07.$1 NCI CGAP^Prl Homo sapiens 213 

Homo sapiens cDNA: FLJ21326 fis, clone

gb:ob39a05.s1 NCI_CGAP_GCB1 Homo sapiens 20.00

EST 403

eukaryotic translation elongation factor

ESTs, Weakly similar to T34036 hypotheti 16.80

ESTs 

ESTs 

EST 

hypothetical protein FLJ10512 113
ESTs, Weakly similar to S65657 alpha-1C-
advanced glycosylation end product-speci
hypothetical protein 20.20
Fanconi anemia, complementation group F 15.20 
tryptase beta 1 

ESTs 1^60 
ESTs 

activator of CREM in testis
ESTs

protein kinase NYD-SP15

ESTs, Weakly similar to S65824 reverse t ia40
hypothetical protein FLJ21939 similar to
FKBP-associated protein 16.20
ESTs

gb9y62i01.8l Soares.mulfiple.sclerosis.

ESTs, Weakly similar to alternatively s

hypothetical protein FLJ10392 17.60

chromosome 21 open reading frame 37 143 

hypothetical protein DKFZp761O0113 

EST

Homo sapiens clone 23632 mRNA sequence
zinc finger protein 106

ESTs 143 



10.60 



4.00 

273 

&00 

15.60 
4.23 
4.20 



7.00 



4.80 
5.00 



1Z10 



3.60 



16.40 



1.60 



1.95 
1.84 



1.77 



2.03 
1.79 



Z18 



6.60 
473 
7.20 
3.78 



10.00 
196 

aeo 



400 
&60 

aeo 



SiOO 
400 



1.81 
1.95 



2.01 
1.85 



1.82 



1,79 



1.75 



1.90 

1.86 
1.90 






WO 02/086443



PCT/US02/12476 






11/ cU? 


u/mmi 

WUJUll 


ns.ouoooi 


■vid 1 rU4j pruiBin 








il/Uzo 


AWU/IKII 


nS.lU4419 


Unmfs eaniane mPMA* /*nMA nin:7nQKMni91 H 

namo sapiens mruMAi cuiv\ ui\r£po(niiwi£i \i 






9^1 


110o»4 


UKI\a.VLA 

nOUoo4 




0b!yp86a1O.s1 Sostbs fetsf Bvsr sptosn 


on ort 






110/04 


ADflflTOTO 
ADUU/tlf9 


Ue ^niORI 

n5.ouizoi 


UnmA e4ffUAne mDMA r\\ f a wrt ^e rttwa 1 eno^tfl^ 

noiiiO sapiDiis innivv cniuuiosonio i specinc 




4.0I' 




110/00 


/UoUODO/ 


Up Qcnorr 


CO IS 


1ft 9n 






116712 


AWWlolo 


nS.bl9ao 


UMmneaptlAne mDMA* aHMA rMrC7n7ftim71 ffr 

nOrnO SSpiSFiS inrVvV CLmMA UAtZP/DIIU/ I (IT 




RDA 




llOrU/ 


nlUJ44 


Uc AOflRn 
rlS.49U0U 


CO I6i weaKiy ouniiar id a unain f\ numan 


1A Rn 






110)901 


Al 149C09 


Ue fiORni 


eifftttl^^ Ia mm leA Yml / ^limO nmlain 

suniiariD niDUSB mu 1 1 unrn^ prmBin 


I9>4U 






llD£f9 


nns /1Z40 


Ue OQiono 


Co 1 Si WBdKiy BUinlar ID ALUl jnURWI ALU O 








IIOIDO 


Al n^Q<un 


Ue OnOOAQ 


KIAA11/I0 nmtatn 

iMAAi iw prutein 






2.13 


1 IDiOZ 


MLU4U0ZI 


Uc 1R99n 


zinc finger protein 106






l.re 


lion / 


RCf^i^xm 

DOD1041U 


n5.oio/o 


ocliDo, snQupiosnnc reocuiuni vansiaMjii 


1^ 90 






11D lU/ 


Al I^QIR 
nLlOOvlO 


Ue 170R70 

ns. 1 /£0/£ 


hijw\ihalh^l nmtcsm Cf lOHHOO 

nypoinoucai puiHn ruizuu^o 


vw< 1 1 






113900 


AAAniTQO 


Ue 1790^ 


hwnnlhiiltngl rwMntn CI 1inQ7n 






2.36 


iioSoo 


Ar^ooolo 


Up AAion 
nS.44l3b 


intracellular membrane-associated calciu


in on 






110044 


AIQT^nCO 
/\IO/OU0£ 


Ue OOOQO0 


nypoineuCoi praieui Muuooru 


10.0/ 






115683 


At 209910 


nS.oAooU 


junctional adhesion molecule 2




97 AA 




115673 


AAi40d341 


U» OeOOfYQ 


Homo sapiens cona rui iswi iiSi cione nc 


11 DO 
11. b£ 






115672 


Aioo91lU 


nS.732o1 


coTS 


in en 
lU>bU 






115566 


AI144330 


H5.43977 


U(iM(M flAlA J JUL lu niL fjimii. jJjLn.li DD41 10fSA14 

Human dna sequence injm cRine Kri i-i9bNi 






1 7R 
1./D 


115313 


A Aononni 
AAoOoODI 


Ue lOAAII 

nS.1 04411 


albumin 


OR on 






1lO£f 9 


AUU0ftAflQ7 


Ue oonfios 


CQTe 
CO IS 




RAA 
O.UU 




IIOZOU 


AA07n^nA 
AnZruJUU 


Ue 10AOOO 


Unmn caniAneAnMA* CI 100100 fie ^Afia i 






1 RH 
I.OU 


llOlIU 


Al^nni £71 
AlwUlO/l 


Ue lion? 
n$>iioor 


IMAAI 400 praiein 


iA9n 






ii4yjw 


□CZ40401 


Ue J170CA 


CQTe 

cols 








1 I^nU 




Ue 1RP717 

ns<ioo/ 1 r 


CO IS 




5.60 




11^099 
114i}ZZ 




Ue 07AQI 
Tu.Df 49l 


Cola 




IfiO 




1 mooi 




nSt 1 ODD90 


ESTs 


43.70 






ii4foy 


/v\14gU0U 


Ue OQAinn 


Co IS 


11 nn 

1 1.W 






114f01 


/v\1 40/01 


Uc 19fi9fin 


hwnolhoKral nm4oin Pi I900QO 


14X0 






114/00 


AIAin^7 
/\IDlUo4/ 


Ue inoflio 


Co 1 s, RNOGeraiBiy siinuar lo ALUi_nuiviAn a 




A9A 
4.CU 




1 14030 


AA^imeo 
AAJiUioz 


ns.ib3z4o 


cytochrome c


in 71 

1U./1 






11 ARIA 
114010 


AU/1A90A7 
nWlOOZDf 


nS>1U040g 


suppressor of var1 (S. cerevisiae) 3-like


&U.4U 






114400 


ru/sUb 


Ue 071C1C 

ns.z/1b1b 


CO 1 8, waany soiuiar u ALUo^nuMAn alu o 


cU.4U 






11440* 


AI^Q07C 


Ue oAonin 


nuino sapiens cuda ru 14440 wS| gnhib nc 




17 9A 
■ /.All 




114359 


■kit 1 i\« CflOO 

NM_016929 


HS.26W21 


cnionoB intraoeitUBr cnannei 0 






0 no 


114357 


K4I0/7 


nS.b1Uf 


rnMno sapiens guna rLji4od9 nSi cone uv 


49 An 






114251 


HI 5261 


Hs^194o 


ESTs 






o/m 


ll4lJ0 




Ue 1C7An 
n5.10/4U 


UftmA eoniAne »nDKIA< rtRMA nt^C7nA0APA00 /fr 

nuDiO saptens mrxNA, clmma uisr£p40'tcuoo (ir 




11 AA 
1 1.4U 




114124 


V\ra7554 


Ua 4 0cn4Q 

nS.1 25019 


c5T8 




b.U4 




11 JjmO 


AWUoOoOO 


Ue 07i)0R 


UArrus eontane i^HKIA CI HOKIfl fie Mnna Dl 

noiTiO sapiens cuiv\ ruiiooiu nS) cione rx 






1 ft9 

(.Oil 


113695 


TDfiOAX: 


Ue 17IMO 


CCTe M^balrhr nirwnHar 1^ Al 1 IS Ul lUAM till 

Co 1 s, weaKiy similar id alud^uiwuv uu 








113606 


riML.Ui J34J 


Ue OTflORI 

r1S./^rbsOl 


NA6-7 prctem 






9 1R 
A.10 


llOOsU 


DAQR^O 
K4*H)4Z 


Ue 1A0A/7 
riS.14Z44/ 


PQTe WoabK/eimilar Vk Al 111 UIIMAMAIIIfi 

Co 1 S| weaKiy suniiar ud alui jiuMAn alu o 




^ RA 




113560 


T(M/\1C 
ItflUlO 


Ue OCQCOC 

nS.zboDZb 


CCTe 
CO IS 


09 nn 






IIJOD* 


AID04ZZJ 


Ue lenoe 


Kmitfitliftltr^Ql nmlain CI 100101 

nypouiBucai pi worn r lj 1 9 1 








113540 


AIRM i:oc4 0 
AWloZOlO 


n8.lD7D7 


ESTs 








iiooUz 


IbsloU 




nK.ttm40flA4 el CfMta^vAna limtfi /OOTOIHV U 

gD^i^ouiaSi ouaiagenc lung yaiciv) n 




0.00 




IIJ^OO 


Ain7fia9fi 


Ue 10Qft7 


CCTe 
Co IS 


19 AA 

l£**IU 






11OZ04 


IVIVL.UU440S 


Ue 11000 


Jt\D \w\A\tr*A^ nwiuilli fafflfif AjBttMiiar an 
C-n03 uUJUCai giuWin laUKN yVBSGwer mi 




427 




113238 


R45467 


HS.109O10 


ESTs 








113203 


AA743563 


Hs. 10305 


ESTs 


04 on 






lldlSO 


UQQoec 
noo^iK) 


Ue fififil 

ns.booi 


Co 1 S, Wadiay SunilBi u> 041U44 Ciuu4iiua0in 






1.92 


113089 


T40707 


nS^uo62 


ESTs 








lltSU/O 


ACm^lOQ 
nrUiMlBs 


Ue 01 08 


zuic nngoT piuKui <U4 




6.00 




lioDDS 


TOO con 
12Jb99 


U(< VOAC 

nS.7Z4b 


ESTs 




Q An 

9.4U 






Aica/9on 
Al0a4a£U 


Ue eooc 


CCTe IA/aqUu elmiTar Ia T170Aa huAAHioll 

Co 1 s, weaioy ssniiar id 1 1 / z4o nypouieu 




19 9A 




1 10001 

1l£09l 


IUJ94/ 


Ue 0001 A7 


CCTe MArferolalw etmRor in AACAin yjt 
Co 1 S| MuQeloiBiy SunUBr U9 /%40U1U Ail 


in ^ 






112794 






0O^/4OUo.si oosres leisi iwerspieeii 


26.60 






IIOCQI 


R88708 


Ue OOnRA7 

nSui2Ub4r 


CCTe 

colS 


IR 0.0 
10.00 






nocno 
1i/d02 


AWUU4U40 


Ue onooee 


CCTe 
col 8 


1R cn 






iiOQce 

11 £000 


ACftOCOIB 
ATUOOOIO 


Ue 10C00 


Homo sapiens clone 23705 mRNA sequence


If; AA 

10.4U 








R49645 


Ue TnnA 
nS.7UU4 


ESTs 


1A /tn 

14.UU 






112064 


Al /\AOOQn 


Ue OOCOQ 


nomo sapiens rnww, cum uiU'zpaooui jio p 


10 nn 






111998 


R42379 


Ue 40flOQO 


ESTs 


11 nn 
11. uu 






111987 


Kill n<ieo4n 

NIVLUl531v 


nS.b/o3 


KIAA0942 protein


oo An 






111803 


AACQO^^ 

AAOsoTol 


Lie OOCQOO 


CCTx \Jl. .AaMtAsXj jJmlLji In Al 1 IE Ul IMAAI A 

cbis, nftooersiey Similar ID AWtJLnUMAN A 






1 77 
1.// 


111737 


H04OU7 


ns.9216 


ESTs 






4 Oft 
l.bb 


111605 


T91081 


HS.i 94178 


coTs, Mooerateiy similar to ry^koes wm 


ZoAJU 






lllOlU 


KU/bob 


Ue •leocc 


ESTs 


14 no 






111341 


AL1 57484 


Hs72483 


Homo sapiens mRNA; cDNA DKFZp762M127 (fr






1.0D 


111280 


AA373527 


Ue lOODC 


tA^ob protein 


IB AA 






111247 


AWu533oQ 


H&16762 


Homo sapiens mRNA; cDNA uKrZpoo4B20b^ \t 








iiiaQO 
11 iZm 


AIOAT7eO 
AiZ4f /Da 


Ue ICQOa 

nS.1b3Zb 


ESTs 


97 Rfl 






110942 


r<ooo(j3 


Hs.28419 


ESTs 


14.80 






110924 


MV*U3O*i0J 




TmfJRnnorts an#) fwimanKnyBe 1 


24.71 






110837 


H03109 


HS.10B920 


HT018 protein 






Z18 


110824 


Ar767183 


Hs.26942 


ESTs 


12^ 






110776 


AB032417 


Hs.19545 


frizzled (Drosophila) homolog 4 






175 


110576 


H60869 


Hs.37889 


ESTs 


13jOO 






110369 


AK000768 


HS.107B72 


hypothetical proton FU^761 




5.60 




110099 


R44557 


Hs.23748 


ESTs 






Z31 


109984 


AI796320 


H5.IO299 


Homo sapiens cDNA FLJ13545 fis, clone PL








109958 


AA001266 


Hs.133521 


ESTs


11.25 






109893 


AA884208 


Hs.30484 


ESTs 






2.68 






WO 02/086443



PCT/US02/12476 






09842 


AW818436 


Hsi3590 


solute carrier family 16 (monocarboxylic


23^ 


09837 


H00656 


Hs29792 


ESTs, Weakly similar to 138022 hypotheti




09796 


AI800515


H&12024 


ESTs 




09688 


R41900 


Hs.22245 


ESTs 


2180 


09648 


H17800 


Hs.7154 


ESTs 


09613 


H47315 


Hs.27519 


ESTs 




09560 


AW02i486 


Hsl6961 


ESTs 






AW193342 


Hs.24144 


ESTs 




09472 


AK001989 


Hs.91165 


hypothetical protein


15.00 


09355 


AAS24525 


H5.48297 


DKFZP586C1620 protein 


09260 


AW978515 


Hs.131915 


KIAA0883 protein


25.60 


08781 


AA1 26654 




gbsnSSgOf^sl Stratagene fista) retina 93 


14J» 


08663 


GE219231 


H5.292653 


ESTs, Weakly similar to T26845 hypotheti


IIjOO 


08573 






gb:zl84c04^1 Stratagene colon (937204) 


26^ 


08480 




Hs.68055 


hypo0)8(ical protein Di<FZp434t0428 




08382 


NM_006770


Hs.67726 


macrophage receptor with collagenous str


1Sl20 


08174 




Hs.303070 


ESTs 


08138 


ALQ4999Q 


Hs.51515 


Homo sapiens mRNA; cDNA DKFZp564G112 (fr


15>14 


08087 


AA045708 


Hs.40545 


ESTs 


08048 


AI797341 


H5.165195 


Homo sapiens cDNA FLJ14237 fis, clone NT




08041 


AW2iy4712 


H&61957 


ESTs 




07997 


AL049176 


Hi82223 








nnlMPu 1 1 


Ks.48469 




14.20 


07922 


BE153855


Hs.61460 


Ig superfamily receptor LNIR


07681 


BE379594 


Hs.49136 


ESTs, Moderately similar to ALU7_HUMAN A


51.80 


07666 


AA010611 


Hs.60418 


EST 


29^ 


07332 


T87750 


Hs.183297 


DKFZP566F2124 protein


ia73 


07292 


BE166479 


Hs.4789 


Homo sapiens serologically defined breas


32.00 


07230 


AI034467 


Hs.34650 


ESTs 


17.40 


107166 


W57578 


Hs.237955 


RAB7, member RAS oncogene family


10.43 


071 60 


AA314490 


Hs^7669 


KIAA1563 protein 


11.40 


107054 


AI076459 


Hs.15978 


KIAA1272 protein 


21.40 


107029 


AF26475D 


HS.2B8971 


myeloid/lymphoid or mixed-lineage leukem


106999 


H93281 


Hs,10710 


liypolheticai protein FLJ^17 


35.80 


106954 


AF128847 


Hs^04038 


indolethylamine N-methyltransferase




106870 


AI983730 


Hs.26530 


serum deprivation response (phosphatidyl


13^ 


106865 


AW192535 


Hs.19479 


ESTs 


108844 


AA485055 


H5.166213 


sperm associated antigen 6




106820 


NM_016831


Hs.12592 


period (Drosophila) homolog 3


1100 


106818 


AK002135 


Hs.3542 


hypothetical protein FLJ11273


106797 


JU768801 

Oil um/w 1 


H5.169943 


Homo sapiens cDNA FLJ13569 fis, clone PL




106773 


AA478109 


Hs.188833 


ESTs 


1Z60 


106747 


NM_007118


Hs.171957 


triple functional domain (PTPRF interact


106743 


BE6133^ 


Hs!21936 


hypothetical protein FLJ12492


meo 


108667 


AW360847 


Hs.16578 


ESTs 




106605 


AW772298 


Hs^1103 


Homo sapiens mRNA; cDNA DKFZp564B076 (fir 




105567 


AW450408 


H&86412 


chromosome 9 open reading frame 5




106562 


AL031846 


Hs.152151 


plakophilin 4




106536 


AA329646 


H&.23604 


ESTs, Weakly similar to PH0039 son3 prot


23.20 


106533 


AL1 34708 


Hs.14^98 


ESTs 


106507 


AA259068 


Hs!267B19 


protein phosphatase 1, regulatory (inhib


1&20 


106490 




Hs 115537 

Vim* 1 1 vvw* 




10.44 


108474 


BE383668 


Hs.42484 


hypothetical protein FLJ10618


106211 


AA428240 


Hs.1 26083 


ESTs 




105986 


AB037722 


Hs!8707 


KIAA1301 protein




105894 


A)904740 


H5J25691 


receptor (calcitonin) activity modifying




105847 


AV)ffi64490 


Hs.32241 


ESTs, Weakly similar to S65657 alpha-1C-




105803 


AW747996 


Hs.1 60999 


ESTs, lUloderateiy simiter to A56194 ttimn 


10.71 


105731 


AAB34664 


Hs^9131 


nuclear receptor coactivator 2


105729 


H4^12 


Hs^93816 


Homo sapiens HSPC285 mRNA, partial cds


23.40 


105688 


A)299139 


HS.17S17 


ESTs 


105510 


Z42047


Hs.283978 


Homo sapiens PRO2751 mRNA, complete cds


37.20 


105101 


H63202 


H5!38163 


ESTs 




104989 


R55996 


Hs.285243 


hypothetical protein FLJ22029




104986 


AWD88826 


Hs.1 17176 


poly(A)-binding protein, nuclear 1




104969 


AI670947 


Hs.78406 


phosphatidylinositol-4-phosphate 5-kinas




104903 


AI436323 


Hs^1141 


Homo sapiens mRNA for KIAA1568 protein,


13.80 


104896 




Hs.23165 


ESTs 


104865 


T79340 


Hs!22575 


Homo sapiens cDNA: FLJ21042 fis, clone C




104825 




Hs.141883 


ESTs 




104781 


AA09d904 


Hs.21610 


DKFZP434B203 protein 




104776 
104691 


AA026349 
U29690 


H&37744 


gbzi99fiOl£l Soaresj)regnant„uterus_NbH 
Homo sapiens beta-1 adrenergic receptor




104667 


A1239923 


H&30098 


ESTs 




104404 


H58762 




gb:EST00057 HE6W Homo sapiens cDNA clone


27iO 


104392 


AA076049 


Hs.274415 


Homo sapiens cDNA FLJ10229 fis, clone HE


104212 


AB0022g8 


Hs.173035 


KIAA0300 protein 




104074 


AL162039 


H&31422 


Homo sapiens mRNA; cONA DKFZp434H«229 (fr 


\m 


103749 


AL135301 


Hs.8768 


hypothetical protein FLJ10849


10J6 


103645 


AW246253 


Hs.7043 


succinate-CoA ligase, GDP-forming, alpha


12.00 


103554 


AI878826 


H&323469 


caveolin 1, caveolae protein, 22kD




103541 


A)815601 


Hs.79197 


CD83 antigen (activated B lymphocytes, i




103496 


Y09267 


H8.132B21 


flavin containing monooxygenase 2 


11.20 


103428 


BE383S07 


H8.78921 


A kinase (PRKA) anchor protein 1


103353 


XB9399 


Hs.119274 


RAS p21 protein activator (GTPase activa


19.80 



17^ 

aeo 



&00 



3.91 



1.89 



1.83 



11.40 
4.76 



7.13 
7.00 



29.80 
3.70 



a30 
&09 

5.40 
7.60 



10.20 
5.69 
a62 
420 



1.76 



2.05 



2.40 
1.7B 
1.76 
Z19 



1.94 
1.7S 
Z47 



1.92 



1,87 



1.91 



1.80 
103295


X81479 


Hs.2375 


103280 


UB4722 


H8.76208 


103100 


NM_005574


H8.16458S 


103025 


NM_002837


Hs.123641 


102698 


M186B7 


Hs.1667 


102659 


BE245169 


HsJ211610 


102580 


U60608 


H8.152981 


102417 


AA034127 


Hs.153487 


102363 


NM_003734


Hs.1 98241 


102302 


AA306342 


Hs.69171 


102283 


AW161552 


Ks^81 


102188 




H8.76913 


102151 


TZ7013 


Hs.3132 


101957 


L28824 


H5.74101 


101842 


M93221 


Hs.75182 


101771 


NM_002432


Hsl53B37 


101764 


AI198550 


Hs.81256 


101716 


AFflSflfiSS 




101678 


M62505 


Hs^lGI 


101447 


M21305 




101383 


NM_000132


Ks.79345 


101346 


AI738616 


H8.77348 


101345 


NM_005795


Hs 152175 


101336 


NM_006732 


Hs.75678 


101330 


L43821 


Hb.80261 


101277 


BE297626 


Hs 296049 


101262 


L35854 




101168 


NM_005308


Hs.211569 


101102 


NM_003243


Hs 79059 


101088 


X7fl697 


Hs.553 


101066 




Hs.889 


100971 


BE379727 

DGWf vf &f 




100893 


BE245^ 




100770 


W9S797 cttim 


Hs 177486 


100716 


X89887 


Hs.172350 


100555 


M69181 




100425 


NM_014747


Hs,78748 


100408 


D66640 


H6.56045 


100382 


D83407 


Hs.156007 


100351 


D64158 




100299 


D49493 


Hs.2171 


100134 


AA305746 


H5.49 


100108 


U0g677 


Hs.76873 


100095 


ZS7171 


H5.78454 


100066 







egf-like module containing, mucin-like,
cadherin 5, type 2, VE-cadherin (vascula
LIM domain only 2 (rhombotin-like 1)
protein tyrosine phosphatase, receptor t
progastricsin (pepsinogen C)
CUG triplet repeat, RNA-binding protein
CDP-diacylglycerol synthase (phosphatida
signal transducing adaptor molecule (SH3
amine oxidase, copper containing 3 (vasc
protein kinase C-like 2
guanine nucleotide binding protein 11
chemokine (C-X3-C) receptor 1
steroidogenic acute regulatory protein
spleen tyrosine kinase
mannose receptor, C type 1
myeloid cell nuclear differentiation ant
S100 calcium-binding protein A4 (calcium
tachykinin, precursor 1 (substance K, su
complement component 5 receptor 1 (C5a l
gb:Human alpha satellite and satellite 3
coagulation factor VIII, procoagulant co
hydroxyprostaglandin dehydrogenase 15-(
calcitonin receptor-like
FBJ murine osteosarcoma viral oncogene h
enhancer of filamentation 1 (cas-like do
microfibrillar-associated protein 4
gb:Human dystrophin (dp140) mRNA, 5' end
G protein-coupled receptor kinase 5
transforming growth factor, beta recepto
solute carrier family 6 (neurotransmitte
Charcot-Leyden crystal protein
fatty acid binding protein 4, adipocyte
S164 protein

amyloid beta (A4) precursor protein (pro
HIR (histone cell cycle regulation defec
gb:Human nonmuscle myosin heavy chain-6
KIAA0237 gene product
src homology three (SH3) and cysteine ri
Down syndrome critical region gene 1-lik

growth differentiation factor 10
macrophage scavenger receptor 1
hyaluronoglucosaminidase 2
myocilin, trabecular meshwork inducible
3^



11.00 
25^0 
14^0 

10.86 



16.40 
15.40 



16.80 
504.80 



mo 



19.38 

15.40 
11.20 
14.80 
33.00 
16j!0 



1.76 
2.15 



7.40 



31/)0 



7,52 



1.78 
122 

1.75 
2.24 

2.01 

1.91 



11^9 



4.00 
4.24 
&20 
21.20 



5.40 



1.79 



TABLE 3B shows the accession numbers for those primekeys lacking unigeneID's for Table 3A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey    CAT number        Accessions

123619  371681_1          AA602964 AA609200
126433  127143_1          AA325606 AA099517 N89423
125831  1522905_1         H04043 D60988 D60337
126816  122973_1          AA248234 AA090985
126852  136135_1          AA39dd61 AA128347
121059  273450_1          AA393283 AA39862B
120637  200885_1          AA811804 AA809404 AA28G9a7 AWg77624
122011  7617.-2           AA431082
120934  177521_1          AA226198 AA228513 AA383773
123802  genbank_AA620448  AA620448
116814  genbank_H50834    H50834
118329  genbank_N63520    N63520
104404  H58762_at         H58762
104776  genbank_AA026349  AA026349
113^2   genbank_T89130    T89130
101262  entrez_L35854     L35854
108573  genbank_AA066005  AA066005
101447  entrez_M21305     M21305
124357  genbank_N22401    N22401
108781  genbank_AA128654  AA128654
112794  genbank_R97018    R97018
100351  entrez_D64158     D64158
100555  tigr_HT2245       M69181 M81105 U51039
Table 4A shows 202 genes upregulated in samples from patients treated with chemotherapy or radiotherapy. These genes were selected from 59680 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: average of AI for samples from patients treated with chemotherapy or radiotherapy divided by the average of AI for normal lung samples
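The R1 column described in the legend can be reproduced from per-sample intensities. The following is a minimal sketch, assuming the AI values are already normalized and held in plain Python lists. The function names, the sample values, and the cutoff of 19.0 are hypothetical and are not taken from the specification.

```python
from statistics import mean


def r1_ratio(treated_ai, normal_ai):
    """Return mean AI of treated samples divided by mean AI of normal samples."""
    normal_mean = mean(normal_ai)
    if normal_mean <= 0:
        raise ValueError("average normal intensity must be positive")
    return mean(treated_ai) / normal_mean


def select_upregulated(probesets, cutoff):
    """Keep probesets whose R1 ratio meets a (hypothetical) cutoff.

    probesets maps a Pkey to a (treated_ai, normal_ai) pair of lists.
    Returns (pkey, ratio) tuples sorted by descending ratio.
    """
    scored = [(pkey, r1_ratio(t, n)) for pkey, (t, n) in probesets.items()]
    kept = [(pkey, ratio) for pkey, ratio in scored if ratio >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Illustrative values only; these are not data from Table 4A.
example = {
    "probeset_a": ([40.0, 60.0], [2.0, 3.0]),
    "probeset_b": ([5.0, 7.0], [4.0, 8.0]),
}
print(select_upregulated(example, cutoff=19.0))
```

A ratio of averages is sensitive to a single very bright sample, which may be one reason the later tables use percentile-based ratios instead.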
R1 

1 27.20
aldo-keto reductase family 1, member C3 20.60
KIAA0042 gene product 20.40
glutamate receptor, metabotropic 5 20.60
E2F transcription factor 3 29.40
topoisomerase (DNA) II binding protein 23.50
KIAA0874 protein 35.56
S164 protein 43.40
POU domain, class 3, transcription facto 21.80
gb:Human alpha satellite and satellite 3 193.60
heparin-binding growth factor binding pr 38.40
bullous pemphigoid antigen 1 (230/240kD) 198.80
desmoglein 3 (pemphigus vulgaris antigen 7^60
gap junction protein, beta 2, 26kD (conn 162.20
nuclear autoantigenic sperm protein (his saoo
cytosolic ovarian carcinoma antigen 1 26.00
UDP-N-acetyl-alpha-D-galactosamine:polyp 37.20
mutS (E. coli) homolog 2 (colon cancer,
RAR-related orphan receptor A 32.00
ISL1 transcription factor, LIM/homeodoma 51.20
deoxyguanosine kinase 13.90
Homo sapiens cDNA: FLJ21800 fis, clone H 2aB0
preferentially expressed antigen in mela 110.60
neurotensin 116.80
enolase 2, (gamma, neuronal) . 2.30
matrix metalloproteinase 1 (interstitial 181.40
serum/glucocorticoid regulated kinase 49.20
5T4 oncofetal trophoblast glycoprotein 88.60
Homo sapiens mRNA; cDNA DKFZp564D016 (fr 42.60
ESTs 29.40
KIAA1488 protein 21.50
hypothetical protein FLJ20287 32.80
Homo sapiens PRO2751 mRNA, complete cds 20.20
paired box gene 5 (B-cell lineage specif 28.40
downstream neighbor of SON 25.40
ESTs, Weakly similar to 138022 hypotheti 32.00
Homo sapiens mRNA; cDNA DKFZ^76tGQ2121 ( 40.60
ESTs 59.80
ESTs 43.40
phosphoserine aminotransferase 50J0
gb:dk04g09j(1 NCI_CGAP_ljj24 Homo sapiens 53.40
KIAA0922 protein 20.68
DKFZP566F2124 protein 23.60
Homo sapiens mRNA; cDNA DKFZp7626207 (fr 57.20
Ig superfamily receptor LNIR 49.00
hypothetical protein 19.67
collagen, type XVII, alpha 1 48.17
RAB6 interacting, kinesin-like (rabkines 59.20
KIAA0863 protein 28.60
hypothetical protein FLJ10453 22.80
KIAA1702 protein
ESTs 21.00
trinucleotide repeat containing 9 31.60
ESTs 24.20
hypothetical protein MGC54B7 21.40
ESTs 20.40
kinesin family member 13A 19.60
ESTs 24.00
HMT1 (hnRNP methyltransferase, S. cerevi 28.40
zinc-fingers and homeoboxes 1 36.00
PDZ domain containing 1 61.20
hypothetical protein 24.60
ESTs 27.20
LIS1-interacting protein NUDE1, rat homo 48.00
KIAA0942 protein 37.80
CDC14 (cell division cycle 14, S. cerevi 26.80
solute carrier family 6 (neurotransmitte 63.80
ESTs, Weakly similar to ALU1_HUMAN ALU 26.40
ESTs, Weakly similar to 155214 salivary 47.64
ESTs 22.00
hypothetical protein FLJ10201 6Sm
Homo sapiens mRNA; cDNA DKFZp761J1324 (f 42.00
microtubule-associated protein 1B 55.40



r\i. 


average of Al for samples 


Pkey 


ExAccn 


UnigenelD 


iVUI Iw 


NM_001269


Hs.84746 


inniR7 


D17793 


Hs.78183 


infill n 


D26361 


Hs3104 


inn99<i 


D28539 


Hs.167t85 


IWU109 


NM_001949


Hs.1189 


100436 


AA013051 


Hs.91417 


100877 


X80821 


Hs.27973 


100893 


BE245294 


Hs.180789 


101273 


Z11933 


Hs.182505 


101447


M21305




101649 


AW959908 


Hs.1690 


101724 


LI 1690 


Hs.620 


101748 


NM_001944


Ks.1925 


lUlOv? 


M86849


Hs.323733 


101879 


AA176374 


Hs.243886 


101915 


Al^7681 


Hs.155185 


101973 


U41514 


Ks.e0120 


102025 


U04045 


Hs.78934 


102031 


U04898 


Hs.2156 


102052 


NM_002202


Hs.505 




AA296874 


Hs.77494 


102420 


U44060 


Hs.14427 




U65011 


Hs.30743 


\\JZO£S 


NlyL006163 


Hs.eo^ 


imnm 


NM_001975


HS.146S80 


lUJUOD 


M13509 


HS.B3169 


MVVSl\7 
llUOUf 


AJ000512 


Hs.296323 


lUJOOf 


BE270266 


Hs.82128 


IIMDOU 


BE298665 


H5.14846 


1U409D 


AWD1531B 


Hs.23165 




AW503733 


Ks.9414 


llQZsd 


BE3&7790 


Hs.26389 


IU301U 


Z42047 


Hs.283978 


1U3D0r 


AA767526 


Hs.22030 


lUDUr J 


AL157441 


Hs.17834 




AW965058 


Hs.111583 


1UQ9ID 


AL137311 


Ks.234074 


1U09<M 


AL134708 


HS.14599B 




AW970602 


Hs.105421 




AW075485 


Hs.286049 


lUDOSI 


A1456623 






AB023139 


Hs.37892 




T87750 


Hs.183297




AA443473 


Hs.173884 


irr7Q99 


BE153855 


Hs.61460 


1UOOU9 


BE409857 


Hs.69499 


lUO/OU 


AU076442 


Hs.1 17938 


109166 


AA2ig691 


Hs.73625 




AW978515 


H5.131915 




AK001355 


Hs.27g610 




AW975746 


HS.16B662 




AA219172 


HS.86B49 




U80736 


KS.110B26 




AA232103 


Hs.189915 




AW967069 


Hs.211556 


109633 


AW003785 


Hs.170267 


109786 


Ai989482 


Hs.146286 


109958 


AA001266 


Hs.133521 


110920 


N47224 


Hs.20521 


110924 


AW05e463 


Hs.12940 


111084 


H44166 


Hs.15456 


111132 


AB037807 


Hs.83293 


111229 


AW389845 


HS.11Q855 


111337 


AAB37396 


Ks.263925 


111987 


NM_015310 


Hs.6763 


112046 


AA383343 


Ks.22116 


112268 


W3S609 


Hs.22003 


112685 


R87650 


Hs.33439 


112871 


All 10216 


KS.122B5 


112897 


AW206453 


Hs.3782 


112973 


AB033023 


H8.318127 


112992 


AL157425 


Hs.133315 


113073 


N39342 


Hs.103042 



113494 
113560
113849 
113950 
114339 
114365 
114455 
114518 
114824 
114837 
114974 
115075 
115084 
115291 
115313 
115597 
115909 
116090 
116107 
116399 
117099 
117881 
118091 
118138 
118720 
118873 
119126 
119717 
119940 
120266 
120515 



T91451 

T91015 

AA457211 

AI267652 

AA782845 

H4216g 

H37908 

AW163267 

AA960961 

BE244930 

AW966931 

AA814043 

BE383668 

BE545072 

AA808001 

D31382 

AW872527 

A1591147 

AL133916 

AA889120 
AF161470

AW0050&4 

AA374756 

N73515 

AI824009 

R4517$ 

AA918317 

AL050097 

A1807264 

AA258356 



120983 
121054 
121369 
122335 
122612 
123130 
123440 
123596 
123619 
124006 
124169 
124281 
124472 
124617 
124631 
124839 
125186 
125321 
125535 
125646 
125684 
125724 
125847 
125934 
126077 



AA398209 

AW976570 

AW450737 

AA4432S8 

AA974832 

AA487200 

AI733692 

AA421130 

AA602S64 

AI1471S5 

BE079334 

AI333756 

N52517 

AW628168 

NM_014053

R5S784 

AA610620 



126395 
126433 
126509 
126538 
126666 
126812 
126872 
127045 
127431 
127489 
127521 
127742 
127925 
127930 
127968 
127987 
128116 
128609 
128777 



NM_013243

AA628962 

AW589427 

AL360igO 

AW161885 

AA19332S 

iyi78772 

AW979155 

AI46B004 

AA325606 

R47400 



129168 
129404 
129527 
129574 
129598 
129765 



AA648886 

AB037860 

AW450979 

AA321948 

AW771958 

AA650250 

AW297206 

AW2934g6 

AA805151 

AA809672 

AA830201 

AI022103 

H07103 

NM_003616

m7B9\8 

AA009647 

AI13298e 

AI267700 

AA769221 

AA026815 

N30436 

H19006 

AV65S806 



nS.ODO0O 


ESTs 


22.60 




ESTs


2280 


Lis naco 


bromodomain adjacent to zinc finger doma


51.80 


nS.oUoU4 


HtMno sapiens mRNA; cDNA DKF^)434^o2 or 


28.20 


Ue 00700 


ESTs


20.20 


Hs.loboo 


hypothetical protein FLJ14627


21.00 


riSJc/lDiO 


ESTs, Weakly similar to ALU8_HUMAN ALU S


25.80 




suppressor of var1 (S. cerevisiae) 3-like


23.60 


HS.o0595o 


zinc finger protein 83 (HPF1) 


27,20 


nS. 100(595 


ESTs 


30.20 


Hs.l 79662 


nucleosome assembly protein 1-like 1


20.80 




ESTs 


30.60 


U- A'iAQA 

nS.424o4 


hypothetical protein FLJ10618


28.86 


Hs.i22579 


hypothetical protein FLJ10461


38.00 


nS.lo4411 


albumin


22.60 




transmembrane protease, serine 4


173.60 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH


27.77 


Hs.61232 


ESTs 


20.80 


nS.i72a7Z 


hypothetical protein FLJ20093


164.20 


Hs.1 10637 


homeo box A10


38.00 




gb.7v16a1 1 .si Scares fetal Ttver spleen 


21.60 


Ms.260622 


Dug^eHnouoed ranscrp 1 


49.40 


nS.4rooo 


ESTs, Weakly similar to KCC1_HUMAN CALCI


22.40 


HS.93S60 


Homo sapiens mRNA for KIAA1771 protein. 


22.00 




gb:za49d07.s1 Soares fetal liver spleen


20.00 


ns.44577 


ESTs 


19.40 


Hs.117i83 


ESTs 


111.20 


HS.S7987 


B-cell CLL/lymphoma 11B (zinc finger pro


33.00 


Hs.272531 


DKFZP586B0319 protein


31.00 


Hs.205442 


ESTs, Weakly similar to T34036 hypotheti


20.20 


Hs.1619 


gb:zi59c10.s1 Soares_NhHMPu_S1 Homo sapi


25.00 


achaete-scute complex (Drosophila) homol


95.40 


HS.g75o7 


EST 


105.20 


nS.973o7 


ESTs 


38.80 


Ue iitnoi 
ns.izo/9i 


(rf^sHia protem 


41.60 




chloride channel, calcium activated, fam


30.80 


>1- 4 OOTAO 

HS.12B7uB 


ESTs 


19.60 


Hs.l 12488 


6b:aDigni2.8l SOBlageneiung (937210) H 


33.20 


ESTs 


23.17 


nS.ii2D4U 


EST 


23.00 


nS.2700lD 


gb»o97c02^1 NCLCGAPJ>r2 Homo sapiens 


28.80 


ESTs


77.60 


ns.271630 


ESTs 


22.20 


Hs,1 11801 


arsenate resistance protein ARS2 


42.20 


Hs.1 02670 


EST 


3Z60 


Hs.1 52684 


ESTs 


21.80 


n&270S94 


FLVCR protein


30.40 


H8.140942 


ESTs 


21.20 


Hs.1 8 1244 


major histocompatibility complex, class


4Z80 


nS.l 78294 


ESTs 


27.00 


n5.2Z2io 


secretogranin III


23.80 


H8.75209 


protein kinase (cAMP-dependent, catalyti


23.20 


ns.15oo49 


Homo sapiens cDNA: FLJ21663 fis, clone C


21.20 


Hs.295978 


Homo sapiens mRNA full length insert cDN 


48.80 


Hs.249034 


ESTs 


31.00 


Hs.32646 


hypothetical protein FLJ21901


21^0 


HS.210B36 


ESTs 


49180 


Hs.296275 


amino acid transporter 2


21.80 


Hs.278956 


hypothetical protein FLJ12929


71.00 




gb:EST28707 Cerebellum D Homo sapiens c 


23.20 


Hs.23850 


ESTs 


23.80 


Hs.17377 


coronin, actin-binding protein, 1C


23.10 


H8.151999 


ESTs 


36.00 


Hs.173933 


nuclear factor I/A


20.80 




gb:UI-H-8liBla-a-120^UI.8l NCLC6AP.Su 


46.29 


Hs.293966 


ESTs 


22.80 


Ui. 47C4<97 

nS. 175407 


bSTs, ModerBtely omflar to PC4259 fern 


30.00 


nS.272076 


ESTs 


20l80 


n5.1o4UlB 


ESTs 


25.20 


Hs.l 601 38 


ESTs 


28.00 


Hs.3628 


mitogen-activated protein kinase kinase 


21.20 


Hs.123304 


ESTs 


2aS4 


Hs.1 24347 


ESTs 


28.20 


U_ 4 A JP44 

Hs.1 24511 


ESTs 


19.60 


Hs.286014 


Homo sapiens, clone IMAGE:3867243, mRNA 


20.40 


Ue iMARR 


survival of motor neuron protein interac


^A An 
34.40 


Hs.10526 


cysteine and glycine-rich protein 2


53.80 


Hs.8850 


a disintegrin and metalloproteinase doma


23.00 


Hs.109052 


chromosome 14 open reading frame 2


37.60 


Hs.317584 


ESTs 


28.60 


Hs.270847 


delta-tubulin


40.80 


Hs.11463 


UMP-CMP kinase 


31.20 


Hs.1 1556 


Homo sapiens cDNA FLJ12566 fis, clone NT


29.60 


Hs.184780 


ESTs 


72.20 


Hs.296ig8 


chromosome 12 open reading frame 4


22.20 
130149


AWQ6780S 


Hs.1 72665 


130199 


248579 


1^172028 


130441 


1163630 


H&155637 


130466 


W19744 


Hs.180059 


130462 


AW409701 


Hs.1578 


130817 


M90516 


Hs.1674 


130703 


R77776 


H8.iai03 


130732 


AW8g0487 


Hs.63984 


130867 


NM_001072 


K&284239 


131028 


AI879165 


Hs.2227 


131086 


AL035461 


Hs^l 


131284 


NM_001429


HS.25Z72 


131775 


AB014S48 


Hs.31921 


13165D 


BE383676 


Hs.334 


131945 


NM_002916


Hs.35120 


132040 


NM_001196


Hs^15689 


132084 


NM.0022S7 


Hs^6 


132389 


AA3103g3 


Hs.190044 


132437 


AA1S2106 


Hs.4859 


132550 


AW969253 


Hs.170195 


132617 


AP037335 


HS.S338 


132632 


AUD76916 


HS.S396 


132672 


W27721 


Hs^97 


132742 


AA0254B0 


Hs.292812 


132771 


Y10275 


Hs.56407 


133070 


U92849 


Hs.64311 


133153 


AF070592 


Hsj66170 


133181 


X91682 


H^.66744 


133282 


AA449015 


H5.286145 


133350 


AI499220 


Hs.71573 


133692 


AV652066 


Hs.75113 


133658 


AA319146 


Hs.75428 


133865 


A6011155 


Hs.170290 


134032 


NM_005025


Hs 78589 


134125 


NM_014781 


HsW21 


134158 


U15174 


Hs.79428 


134321 


BE538082 


H&8172 


134367 


AA339449 


HS.8228S 


134570 


U86615 


Hs.172280 


134753 


NM_006482


Hs.173135 


135002 


AA448542 


Hs.251677 


135029 


H58B18 


Hs.187579 


135047 


AL134197 


HSJ3S97 


13S345 


XS365S 


Hs^71 



methylenetetrahydrofolate dehydrogenase
a disintegrin and metalloproteinase doma
protein kinase, DNA-activated, catalytic
Homo sapiens cDNA FLJ20653 fis, clone KA
baculoviral IAP repeat-containing 5 (sur
glutamine-fructose-6-phosphate transamin

ESTs

cadherin 13, H-cadherin (heart)
UDP glycosyltransferase 1 family, polype
CCAAT/enhancer binding protein (C/EBP),
chromogranin B (secretogranin 1)
E1A binding protein p300
KIAA0648 protein

Rho guanine nucleotide exchange factor (
replication factor C (activator 1) 4 (37
Homo sapiens cDNA: FLJ22373 fis, clone H
karyopherin alpha 3 (importin alpha 4)
ESTs

cyclin L ania-6a

bone morphogenetic protein 7 (osteogenic

carbonic anhydrase XII

guanine monophosphate synthetase

Cdc42 guanine exchange factor (GEF) 9

ESTs, Weakly similar to T33468 hypotheti

phosphoserine phosphatase

a disintegrin and metalloproteinase doma

HSKM^ protein

twist (Drosophila) homolog (acrocephalos
SRB7 (suppressor of RNA polymerase B, ye
hypothetical protein FLJ10074
general transcription factor IIIA
secretogranin II (chromogranin C)
discs, large (Drosophila) homolog 5
serine (or cysteine) proteinase inhibito
KIAA0203 gene product
BCL2/adenovirus E1B 19kD-interacting pro
ESTs, Moderately similar to A46010 X4In
phosphoribosylglycinamide formyltransfer
SWI/SNF related, matrix associated, acti
dual-specificity tyrosine-(Y)-phosphoryl
G antigen 7B

hydroxysteroid (17-beta) dehydrogenase
cyclin-dependent kinase 5, regulatory su
%3 



29.60 

27^ 

28^ 

20.20 

22.40 

19.60 

19.40 

21.40 

110.00 

2SJiO 

40J0 

24.60 

21.00 

33.40 

60.80 

20.40 

2940 

32.40 

27.40 

75.60 

31.36 

32.40 

23.40 

61.20 

22.33 

2150 

30.00 

2a80 

51.60 

33.00 

82.00 



33.20 
31.60 
30.60 
2140 
49i0 
20.20 
20.80 
37i0 
5140 
31.60 

2aso 



TABLE 4B shows the accession numbers for those primekeys lacking unigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey    CAT number        Accessions

123619  371681_1          AA602964 AA609200
126433  127143_1          AA325605 AA099517 N89423
126672  142696_1          AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE011212 BE011359
                          BE011367 BE011388 BE011362 BE011215 BE011365 BE011363
106851  322947_1          AI456623 AA639708 AA485409 R22065 AA485570
118720  genbank_N73515    N73515
120515  genbank_AA258356  AA258356
117099  321871_1          H93699 H97976 H80036
101447  entrez_M21305     M21305
123130  genbank_AA487200  AA487200
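Rows of this table can be read into a lookup from Pkey to the Genbank accessions of its gene cluster. The sketch below assumes whitespace-separated rows with the Pkey first, the CAT number second, and the accessions after that. The function names are hypothetical, and the two sample rows are transcribed from the table above with the OCR damage in the CAT number cleaned up.

```python
def parse_cluster_row(line):
    """Split one table row into (pkey, cat_number, accessions)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected Pkey, CAT number and at least one accession")
    pkey = int(fields[0])
    return pkey, fields[1], fields[2:]


def build_lookup(rows):
    """Map each Pkey to the list of accessions in its gene cluster."""
    lookup = {}
    for row in rows:
        pkey, _cat, accessions = parse_cluster_row(row)
        lookup[pkey] = accessions
    return lookup


rows = [
    "118720 genbank_N73515 N73515",
    "117099 321871_1 H93699 H97976 H80036",
]
lookup = build_lookup(rows)
print(lookup[117099])
```

Keeping the CAT number as a string avoids guessing at its internal format, since values such as `321871_1` and `genbank_N73515` are not plain integers.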
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WO 02/086443 PCTAJS02/12476 
Table 5A shows 660 genes upregulated in squamous cell carcinoma or adenocarcinoma [...] selected from 59680 probesets on the Eos/Affymetrix Hu03 GeneChip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples divided by [...] diseased lung samples.
R2: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal [...]
R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples divided by the 90th percentile [...]
R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile of AI for squamous cell [...]
R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples minus the 15th percentile [...] diseased lung and tumor samples divided by 90th percentile of AI for normal and chronically diseased lung [...] normal lung, chronically diseased lung and tumor samples
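
As a rough illustration of how percentile ratios of the kind tabulated in columns R1 to R5 can be computed from per-sample average intensity (AI) values, the following Python sketch may help. It is not taken from the specification: the function names, the linear-interpolation percentile rule, the pooling of tumor and reference samples for the background term, and the sample values are all assumptions made for illustration, and the percentile method actually used to prepare the table is not stated.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) of values by linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def percentile_ratio(tumor_ai, reference_ai, tumor_pct, reference_pct):
    """Ratio of a tumor percentile to a reference percentile (R1-R4 style)."""
    return percentile(tumor_ai, tumor_pct) / percentile(reference_ai, reference_pct)


def background_subtracted_ratio(tumor_ai, reference_ai, background_pct=15,
                                tumor_pct=70, reference_pct=90):
    """R5-style ratio: a low percentile of all samples is subtracted from both terms.

    Assumption: the background is taken over tumor and reference samples pooled.
    No guard is applied for a zero or negative denominator.
    """
    background = percentile(list(tumor_ai) + list(reference_ai), background_pct)
    numerator = percentile(tumor_ai, tumor_pct) - background
    denominator = percentile(reference_ai, reference_pct) - background
    return numerator / denominator


# Hypothetical AI values for a single probeset (not data from the tables).
tumor = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

r1_like = percentile_ratio(tumor, reference, 70, 90)
r5_like = background_subtracted_ratio(tumor, reference)
```

Using a high percentile of the reference samples in the denominator, rather than a mean, makes the ratio conservative: a probeset scores well only if most tumor samples exceed nearly all reference samples.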
ExAccn




JUUUOa 






IflflMft 
1UuU<9f






100071 






iflniiA 

IWI 1** 




Lie n9QR9 


inniftA 




t%s.otast£ 


inniR7 




u. 70400 


IflfllRft 
IUU100 










Ue QQQI n 


100216 




ns. ii3su 






Ue lino 


lUUcOf 




nS.iOw 


















Hs.6793 


luUooU 


W701 # 1 


Ui> 7CO0O 


100372 


MU MJtyM 

NAIUf1479i 


r13.i84938 


1uu4r4 


Uki nnncDfi 
NAILuuOSSg 




100486 


T19006 


Lli> 4^044 

nS.1Do42 




056165 






nruvna 


Us 44 
nS.1i 






Us IMMMfk 






Ub 4CAt\ 

nS.1040 


100576 


X00356 


n$.3705o 


100629 




Lin 04 m4 
n5.21291 


lUUDOi 


UcnZoUUl 


Ue i^tTAQ 


100677 


AAjOdOOD 




100696 




Ufi 4')4eilC 


IUU7U9 


N26S39 


Uf> 4nniieo 
n5.1 00469 


100761 


bk2Do491 


Un '}nC4 40 

nSj(95112 


IDUOOU 


AO004770 


Ks.4756 


100867 


U14622 




100902 


Ml 6029 


Hs^7270 


100908 


AU076916 


Hsj398 


1OU9D0 


J00124 


Hs.1 17729 


1UiU4o 


JU0014 




101061 


NML0D0175 


ns.16{n32 








101124 L10343 Hs.112341
101175 U82671 Hs.36980
101181 BE262621 Hs.73798
101204 L24203 Hs.82237
101210 L29301 Hs.2353
101216 AA284166 Hs.84113
101226 AA333387 Hs.82916
101233 AL135173 Hs.878
101273 Z11933 Hs.182505
101342 U52112 Hs.182018
101346 AI738616 Hs.77348
101369 NM_000892 Hs.1901
101396 BE267931 Hs.78996
101431 BE185289 Hs.1076
101448 NM_000424 Hs.195850
101462 AL035668 Hs.73853
101466 BE262660 Hs.170197
101484 AA0534fi6 Hs^0315
101502 M2695a
101505 AA307680 Hs.75692
101526 NM_002197 Hs.154721
101535 X57152 Hs.99853
101577 M34353 Hs.1041
101649 AW959908 Hs.1690
101663 NM_003528 Hs.2178
101664 AA436969 Hs.121017
101669 L24498 Hs.80409

Unigene Title

AFFX control: GAPDH
AFFX control: GAPDH
AFFX control: GAPDH
Human GABAa receptor alpha-3 subunit
thymidylate synthetase
KIAA0101 gene product
aldo-keto reductase family 1, member C3
minichromosome maintenance deficient (S.
phosphofructokinase, platelet
proteasome (prosome, macropain) subunit,
E2F transcription factor 3
chaperonin containing TCP1, subunit 5 (e
protein disulfide isomerase-related prot
minichromosome maintenance deficient (S.
platelet-activating factor acetylhydrola
uridine monophosphate kinase
KIAA0175 gene product
amylase, alpha 2A; pancreatic
RAN, member RAS oncogene family
non-metastatic cells 2, protein (NM23B)
carcinoembryonic antigen-related cell ad
prolactin-induced protein
collagen, type VII, alpha 1 (epidermolys
calcitonin/calcitonin-related polypeptid
mitogen-activated protein kinase kinase
Homo sapiens ribosomal protein L39 mRNA,
dncrittan domain contafadn^ 1
general transcription factor IIA, 1 (37k
myeloid/lymphoid or mixed-lineage leukem
KIAA0618 gene product
flap structure-specific endonuclease 1
glKHiiman transletoias&ae pratBin gene
ret proto-oncogene (multiple endocrine n
guanine monphosphate synthetase
keratin 14 (epidermolysis bullosa simple
gb:Human proliferating cell nuclear anti
glucose phosphate isomerase
potassium voltage-gated channel, Shaker
protease inhibitor 3, skin-derived (SKAL
melanoma antigen, family A, 2
macrophage migration inhibitory factor (
ataxia-telangiectasia group D-associated
opioid receptor, mu 1
cyclin-dependent kinase inhibitor 3 (CDK
chaperonin containing TCP1, subunit 6A (
sorbitol dehydrogenase
POU domain, class 3, transcription facto
interleukin-1 receptor-associated kinase
hydroxyprostaglandin dehydrogenase 15-(N
kallikrein B, plasma (Fletcher factor) 1
proliferating cell nuclear antigen
small proline-rich protein 1B (cornifin)
keratin 5 (epidermolysis bullosa simplex
bone morphogenetic protein 2
glutamic-oxaloacetic transaminase 2, mit
interferon-induced protein with tetratri
gb:Human parathyroid hormone-related pro

aconitase 1, soluble
fibrillarin
v-ros avian UR2 sarcoma virus oncogene h
heparin-binding growth factor binding pr
H2B histone family, member Q
H2A histone family, member A
growth arrest and DNA-damage-inducible,



R1 R2 R3 



R4 



8.00 



184 
3.33 



Z55 



5.07 



3.10 
3.85 



7.20 



&60 



257 

ai2 

aso 

4.08 
2.53 

&S0 

124 
&31 

msd 

4.02 



54.00 
5J9 
7.00 



7i0 
10.20 

aoo 



12.91 



24.80 



6.40 



15.65 



14.20 

9.30 
20.60 



iQJOO 



R5 

a76 
5.77 
175 

171 



4.52 
&49 
&67 

&66 
181 
4.50 

4.82 
179 

5.49 
4.17 



&16 

4.69 
4.19 



21.89 
1Z80 



38.80 
12.00 



9.09 



7.90 
4.45 

4.17 

7.90 

4.01 

4.46 
4.65 



7.60 

101695 M69136 Hs.135626
101724 L11690
101748 NM_001944
101759 M80244
101771 NM_002432
101804 M86699
101809 M86849
101833 AU076442
101842 M93221
101851 BE260964
102002 NM_002484
M.134223
102072 U09410
102083 T35901
102111 L36196
102123 NM_001609
102154 U17760
102193 AL036335
102217 AAa29978
102224 NM_002810
102234 AW163390
102251 NM_00439S
102305 AL043202
102330 BE298083
102340 U37055
102348 U3^19
102368 U39817
102394 NM_00381&
102404 NM_005429
102537 U57094
102581 AU077228
102605 AI435128
102610 U65011
102623 AW249285
102642 AA205847
102654 AV649989
102659 BE245169
102669 U71207
102672 U72066
102687 NM_007019
102696 BE540274
102758 082 321
102781 BE258778
102784 U85658
102824 U90916
102829 NM_006183
102888 AI346201
102892 BE440042
102913 NM_002275
102935 BE561850
102951 X15216
102983 BE387202
103023 AW500470
103036 M13509
103036 AA926960
103060 NM_005940
103099 AI693251
103119 X63629
103168 X53463
103185 NM_006825
103192 M22440
103223 BE27K07
103242 X76342
103316 X83301
103375 NM_005982
103376 AL036166
103385 NM_007069
X94453
103404 BE394784
103430 BE5&4090
103446 X98834
103476 Y07701
103477 AJ011812
103478 BE514982
103515 Y10275
103558 BE616547
103580 AA328046
103587 BE270266
103594 AI368680
103636 NM_006235



Hs.620 
Hs.1925 
Hs.184601 
H3.153837 
Hs.169840 
Hs.323733 
Hs.117938 
Hs.75182 
Hs.82045 



Hs.306096 

Hs.78743 

HS75117 

Hs.81884 

H8.1594 

Hs.75517 

Hs.313 

Hs.301613 

Hs.148495 

Hs.278554 

Hs.41706 

Hs.90073 

Hs.77254 

Hs^8657 

H5.87539 

Hs.36820 

Hs.2442 

Hs.79141 

Hs^77 

Hs.77256 

HS.1813G9 

Hs.30743 

Hs.37110 

Hs^ie 

Hs^4385 

Hs^ieio 

HSw29279 
Hs^7 
Hs.93002 
H&239 

H8.108809 

Hs.61796 

Hs.82845 

H8.80962 

Hs.76118 

Hs.83326 

Hs.e0342 

HS.80S06 

K&2969 

Hs.118638 

Hs.117950 

Hs.83169 

Hs.334883 

Hs.155324 

HS.8248 

Hs.2877 

Hs^ 

Hs.74368 

Hs.170009 

Hs.1708 

Hs.389 

Hs.324728 

HsM16 

Hs^23378 

Hs.37169 

Hs.114366 

Hs.78596 

Hs^716 

Hs.79971 

Hs.293007 

Hs.119018 

Hs.38991 

Hs^07 

Hs.2785 

Hs.46405 

Hs.82126 

Hs.616 

HS^7 



I03S41 AA314821 HsJ3dm 

103847 AF219946 Hs.t02237 

103913 AW967500 Hs.133543 

104094 AM18187 Hs^lS 



ia57 



1280 



12^ 



12.00 



&60 



19.20 



22.00 



a33 



14.00 



i2M 



chymase 1, mast cell 4.79
bullous pemphigoid antigen 1 (230/240kD) 15.21
desmoglein 3 (pemphigus vulgaris antigen 5S.50
solute carrier family 7 (cationic amino
myeloid cell nuclear differentiation ant
TTK protein kinase 4.50
gap junction protein, beta 2, 26kD (conn 140.ra
collagen, type XVII, alpha 1 2.56
mannose receptor, C type 1
midkine (neurite growth-promoting factor
nucleotide binding protein 1 (E.coli Min 7.80
aldo-keto reductase family 1, member C1
zinc finger protein 131 (clone pHZ-10) 7.40
interleukin enhancer binding factor 2, 4
sulfotransferase family, cytosolic, 2A,
centromere protein A (17kD)
laminin, beta 3 (nicein (125kD), kalinin
secreted phosphoprotein 1 (osteopontin,
JTV1 gene
proteasome (prosome, macropain) 26S subu
heterochromatin-like protein 1
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
chromosome segregation 1 (yeast homolog)
chromobox homolog 1 (Drosophila HP1 beta
macrophage stimulating 1 (hepatocyte gro
aldehyde dehydrogenase 3 family, member
Bloom syndrome
a disintegrin and metalloproteinase doma
vascular endothelial growth factor C
RAB27A, member RAS oncogene family
enhancer of zeste (Drosophila) homolog 2
ubiquitin fusion degradation 1-like
preferentially expressed antigen in mela
melanoma antigen, family A, 9
G protein-coupled receptor
Human hbc647 mRNA sequence
CUG triplet repeat, RNA-binding protein
eyes absent (Drosophila) homolog 2
retinoblastoma-binding protein 6
ubiquitin carrier protein E2-C
forkhead box M1
gb:Homo sapiens clone 14iB mRNA sequenc
chaperonin containing TCP1, subunit 7 (e
transcription factor AP-2 gamma (activat
Homo sapiens cDNA: FLJ21930 fis, clone H
ubiquitin carboxyl-terminal esterase L1
maffixmaiauoproisfliasa 4 (Buumaiyain
keratin 15
small nuclear ribonucleoprotein polypept
v-ski avian sarcoma viral oncogene homol
non-metastatic cells 1, protein (NM23A)
multifunctional polypeptide similar to S
matrix metalloproteinase 1 (interstitial
CDC28 protein kinase 1
matrix metalloproteinase 11 (stromelysin
NADH dehydrogenase (ubiquinone) Fe-S
cadherin 3, type 1, P-cadherin (placenta
glutathione peroxidase 2 (gastrointestin
transmembrane protein (63kD), endoplasmi
transforming growth factor, alpha
chaperonin containing TCP1, subunit 3 (g
alcohol dehydrogenase 7 (class IV), mu o 100.00
SMA5 9.80
sine oculis homeobox (Drosophila) homolo 0.71
coated vesicle membrane protein 14.00
similar to rat HREV107 11.00
pyrroline-5-carboxylate synthetase (glut 2.93
proteasome (prosome, macropain) subunit
translocase of inner mitochondrial membr
sal (Drosophila)-like 2 21.40
aminopeptidase puromycin sensitive 13.00
transcription factor NRF 6.40
S100 calcium-binding protein A2
phosphoserine phosphatase
polymerase (RNA) II (DNA directed) polyp
5T4 oncofetal trophoblast glycoprotein
SRY (sex determining region Y)-box 2
POU domain, class 2, associating factor
gb:Homo sapiens full length insert cDNA
hypothetical protein FLJ23468 aOO
tubby super-family protein 10.40
ESTs 15.60
ESTs &6D



6.20 
Z62 
&85 



450 



a87 
15.91 



77.50 
1Z50 



asD 
aso 



aoo 



4.64 
Z93 



101 
27.90 



4.06 
a07 



&02 
10.50 

a4i 

76.80 

asi 
aso 
4.35

ai2 



6.18 
4.49 
5.80 

5.16 
4.17 



4.57 

as8 



14.40 
6.70 



11.40 



aoo 



7.40 



9.24 
5.54 

3.78 
426 



5.S0 



7.26 



a79 

4.27 
4.70



ai5 

a98 



3.84 



4.48 



112 






104150 AU22044 Hs.331633
104257 BE550821 Hs.9222
104261 AW248364 Hs.5409
104331 AB040450 Hs.275862
104415 BE410992 Hs.258730
104558 R56678 Hs.88959
104590 AW373062 Hs.83623
104658 AA360954 HSJ7268
104660 BE298665 Hs.14846
104689 AA420450 Hs.292911
104754 A)206234 Hs.155924
104758 BE560269 Hs.7010
104971 BE311926 Hs.15830
105011 BE091926 Hs.16244
105012 AF098158 Hs^
105026 AA809465 Hs.124219
105076 AI598252 Hs.37810
105132 AA148164 Hs.247280
105143 AI358836 Hs.24808
105158 AW976357 Hs.234545
105175 AA305364 Hs.25740
105200 AA328102 Hs.24641
105264 AA227934
105298 BE387790 H$^36g
105409 AW505076 Hs^lSSS
105460 AW296078 Hs.271721
105667 AA767526 Hs^030
105743 BE246502 Hs.9598
105782 H09748 HS579S7
105848 AW954064 Hs^4951
105891 U55984 Hs^g088
106019 AF221993 Hs.46743
106059 BE566623 Hs.29899
106073 AL157441 Hs.17834
106126 AA57eg53 Hs.22972
106159 AK001301 Hs.3487
106220 061329 Hs.32196
106260 AI097144 Hs.5250
106300 Y10043 Hs.19114
106307 AA436174 H&jsnsi
106318 AA025610 Hs.9605
106341 AF191020 H8.S243
106440 AA449563 Hs.151393
106481 081594 Hs.17279
106586 AA243837 Hs^7
106605 AW772298 Hs^1103
106654 AW075485 Hs.288049
106765 Y15227 Hs.20149
106813 C05766 Hs.181022
106895 AK001826 Ha^5
106913 AI219348 Hs.86178
106919 AW043637 H551766
107054 AJ076459 Hs.15978
107059 BE614410 Hs.23044
107098 AI823593 Hs^88
107104 AU076640 Hs.15243
107129 AC004770 Hs.4756
107198 AV557225 Hs.9846
107203 D20426 Hs.41639
107217 AL080235 Hs.35861
107284 NM_005629 Hs.187958
107318 T74445 Hs.5957
107516 X57152 Hs.99853
107529 BE515065 Hs^6585
107728 AA019551 Hs^151
107851 AA022953 Hs.61172
107901 L42612 Hs.335952
107922 BE153855 Hs.61460
107932 AW392555 Hs.16878
108015 AW298357 Hs.49927
108056 AA043675 Hs.62633
108075 AI867370 Hs.139709
108187 BE245374 Hs.27842
108296 N31256 Hs.161623



108305 AA071391

108393 AA075211 

108480 AL1330g2 

108554 AA084948 

108573 AA086005 

108584 AA088326 

108597 AKQ00292 

108695 AB029000 

108699 AA121514 

108700 AA121518 
108780 AU076442 



Hs.68055 



Hs.120905 
H8.278732 
Hs.70823 
Hs.70832 
Hs.193540 
H&1 17938 



hypothetical protein DKFZp566N034
estrogen receptor binding site associate
RNA polymerase I subunit
cdk inhibitor p21 binding protein
heme-regulated initiation factor 2-alpha
hypothetical protein MGC4816
nuclear receptor subfamily 1, group I, m
Homo sapiens cDNA: FLJ21933 fis, clone H
Homo sapiens mRNA; cDNA DKFZp564O016 (fr
ESTs, Highly similar to S60712 band-6-pr
cAMP responsive element modulator
NPD002 protein
hypothetical protein FLJ12691
mitotic spindle coiled-coil related prot
chromosome 20 open reading frame 1
hypothetical protein FLJ12934
hypothetical protein MGC14833
HBV associated factor
ESTs, Weakly similar to 136022 hypotheti
hypothetical protein NUF2R
ERO1 (S. cerevisiae)-like
cytoskeleton associated protein 2
gb:zr57e08.s1 Soares_NhHMPu_S1 Homo sapi
hypothetical protein FLJ20287
DiGeorge syndrome critical region gene 8
Homo sapiens, clone IMAGE:4179986, mRNA,
paired box gene 5 (B-cell lineage specif
sema domain, immunoglobulin domain (Ig),
B-cell CLL/lymphoma 11B (zinc finger pro
ESTs
heat shock 90kD protein 1, alpha
McKusick-Kaufman syndrome
ESTs, Weakly similar to G02075 transcrip
downstream neighbor of SON
hypothetical protein FLJ13352
hypothetical protein FLJ10439
mitochondrial ribosomal protein iM
ESTs, Weakly similar to ALU1_HUMAN ALU S
high-mobility group (nonhistone chromoso
ESTs, Weakly similar to putative p150 [
cleavage and polyadenylation specific fa
hypothetical protein, estradiol-induced
glutamate-cysteine ligase, catalytic sub
tyrosylprotein sulfotransferase 1
ESTs
Homo sapiens mRNA; cDNA DKFZp564B076 (fr
phosphoserine aminotransferase
deleted in lymphocytic leukemia, 1
CGI-07 protein
hypothetical protein FLJ11289
M-phase phosphoprotein 9
ESTs, Weakly similar to ALU5_HUMAN ALU S
KIAA1272 protein
RAD51 (S. cerevisiae) homolog (E coli Re
ESTs
nucleolar protein 1 (120kD)
flap structure-specific endonuclease 1
KIAA1040 protein
programmed cell death 2
DKFZP586E1621 protein
solute carrier family 6 (neurotransmitte
Homo sapiens clone 24416 mRNA sequence
fibrillarin
nucleolar protein (KKE/D repeat)
Homo sapiens, clone IMAGE:3603830, mRNA,
EST
keratin 6B
Ig superfamily receptor LNIR
hypothetical protein FLJ21620
protein kinase NYD-SP15
ESTs
hypothetical protein FLJ12572
hypothetical protein FLJ11210
ESTs
gb:zm61e06.r1 Stratagene fibroblast (937
gbzmSeaOSj-l Stratagene ovarian cancer
hypothetical protein DKFZp434l0428
gb2n13b09^1 Stratagene hNT neuron (937
gb2l84o04^1 Stratagene colon (937204)
Homo sapiens cDNA FLJ11448 fis, clone HE
hypothetical protein FLJ20285
KIAA1077 protein
ESTs
ESTs, Moderately similar to 2109260A B c
collagen, type XVII, alpha 1
11.00



16.00 
4.47



5.01 
3.89 



6.60 



4.75 



28:00 
3.00 



4.71 



Z60 



aso 

Z71 



3.40 
2.88 
7.60 



6.56 



19.20 
7.60 



7.80 



7.60 
23.40 



13i0 



13.80 



11.40 
6.00 



10X0 
9.20 



4.14 



3.95 
6.04 

5.02 

5.04 
7.25 



10.84 
45.60 



34.80 
24.80 



171 



im 



4.27 



7.05 



4.33 
aoo



7.00 



&40 



3.00 
108810 AW295647 Hs.71331 hypothetical protein MGC5350 8.50
108816 AA130884 HsJ!7050l ESTs, Moderately similar to ALU2_HUMAN
108857 AK001468 Hs.62180 anillin (Drosophila Scraps homolog), act 4.00
108860 AA133334 Hs.129911 ESTs &09
108937 AL050107 Hs.24341 transcriptional co-activator with PDZ-bi 3.00
109010 NM_007240 Hs.44229 dual specificity phosphatase 12 m
109121 BE389387 Hs.49767 NADH dehydrogenase (ubiquinone) Fe-S pro
109166 AA219691 Hs.73625 RAB6 interacting, kinesin-like (rabkines \m
109227 AA766998 Hs.85674 Human DNA sequence from clone RP11-16L21
109415 U80736 Hs.110826 trinucleotide repeat containing 9
109416 AI866946 Hs.161707 ESTs
109454 AA232255 Hs.295232 ESTs, Moderately similar to A46010X4i
109502 AW967069 Hs.211556 hypothetical protein tk/IG06467
109543 AA564994 Hs.222651 ESTs
109648 H17800 Hs.7154 ESTs
109680 AB037734 Hs.4993 KIAA1313 protein
109700 F09609 gb:HSC33HG32 normalized infant brain cDN
109704 AI743880 Hs.12876 ESTs
109792 R49625 gb:yg61f03^1 Soares infant brain 1NIB H
109981 BE546208 Hs.26090 hypothetical protein FLJ20272 AM
109998 AL042201 Hs.21273 transcription factor NYD-sp10
110039 H11936 Hs.21907 histone acetyltransferase
110156 AA581322 Hs.4213 hypothetical protein MGC16207
110500 AA307723 Hs.36962 ESTs 4.50
110551 AW450381 Hs.14523 ESTs
110561 AA379597 Hs.5199 HSPC150 protein similar to ubiquitin-con ao8
110654 BE612992 Hs.27931 hypothetical protein FLJ10607 similar to
110886 AW274992 Hs.72249 three-PDZ containing protein similar to
110916 BE178102 Hs.24349 ESTs
111003 N52980 Hs.83765 dihydrofolate reductase
111337 AA837396 Hs.263925 LIS1-interacting protein NUDE1, rat homo 2.54
111434 R01608 Hs.142738 ESTs
111439 AI476429 Hs.19238 ESTs
111540 U82670 Hs.9786 zinc finger protein 275
111597 R11499 Hs.189716 ESTs
111895 T60581 Hs.12723 Homo sapiens clone 25153 mRNA sequence
111929 AF0^2Q8 Hs.112380 prominin (mouse)-like 1
112054 R43530 gb:yc65o01.s1 Soares infant brain 1NIB H
112210 R49645 Hs.7004 ESTs
112244 AB029000 Hs.70823 KIAA1077 protein 2.99
112382 R5d904 gb:yh07s1Zs1 Soares infant brain 1NIB H
112392 R60763 Hs.193274 ESTs, Moderately similar to 157588 HSrd
112442 AA280174 Hs.285681 Williams-Beuren syndrome chromosome reg aoo
112539 R70318 Hs.339730 ESTs
112772 AI992283 Hs.35437 ESTs, Moderately similar to 138026 MLN 6
112889 BE261750 Hs.4747 dyskeratosis congenita 1, dyskerin
112935 R71449 Hs.268760 ESTs 173
112970 AA594010 Hs.6932 Homo sapiens clone 23609 mRNA sequence
112973 AB033023 Hs.318127 hypothetical protein FLJ10201 llfO
112992 AL157425 Hs.133315 Homo sapiens mRNA; cDNA DKFZp761J1324 (f
113083 W1S573 HS.S027 ESTs, Weakly similar to A47582 ^ gr 15.00
113073 N39342 Hs.103042 microtubule-associated protein 1B
113078 T40444 Hs.118354 CAT56 protein
113238 R45467 Hs.189813 ESTs
113591 T9ie61 Hs.200597 KIAA0563 gene product
113702 T97307 gb:ye53h85^1 Soares fetal liver spleen 25.00
113844 AI369275 Hs.243010 Homo sapiens cDNA FLJ14445 fis, clone HE
113984 Rd6696 Hs.35598 ESTs
114073 R44953 Hs.22908 Homo sapiens mRNA; cDNA DKFZp434J1027 (f
114162 AF155661 Hs.22265 pyruvate dehydrogenase phosphatase 3.42
114208 AL049466 Hs.7858 ESTs
114251 H15261 Hs.21948 ESTs
114285 R44338 Hs.22974 ESTs
114313 H18456 Hs.27946 ESTs
114339 AA782845 Hs.22790 ESTs
114407 BE539976 Hs.103305 Homo sapiens mRNA; cDNA DKFZp434B0425 (f
114560 AI452469 Hs.165221 ESTs
114699 AA127366 gb:zn60d09.r1 Stratagene lung carcinoma
114767 AI859865 Hs.154443 minichromosome maintenance deficient (S 3.21
114793 AA158245 gb:zo76c033l Stratagene pancreas (93720
114833 AI417215 Hs.87159 hypothetical protein FLJ12577
115047 BE270930 Hs.82916 chaperonin containing TCP1, subunit 6A (
115060 AF052693 Hs.198249 gap junction protein, beta 5 (connexin 3
115097 AA256213 Hs.72010 ESTs
115113 AA256460 gb:zr81a04.s1 Soares_NhHMPu_S1 Homo sapi
115123 AA256641 Hs.236894 ESTs, Highly similar to S02392 alpha-2-m
115134 AW968073 Hs.194331 ESTs, Highly similar to AS5713 Inodbl
115291 BE545072 Hs.122579 hypothetical protein FLJ10461 25.00
115347 AA356792 Hs.334824 hypothetical protein FLJ14625
115414 AA662240 Hs.283099 AF15q14 protein 3.25
115522 BE614387 Hs.333893 c-Myc target JPO1 a68
115536 AK001468 Hs.62180 anillin (Drosophila Scraps homolog), act 10.50
115566 AI142336 Hs.43977 Human DNA sequence from clone RP11-196N1
115645 AI207410 HS.G9280 Homo sapiens, clone IMAGE:3636299, mRNA, 4.17
115648 AW016811 Hs.234478 Homo sapiens cDNA: FLJ22648 fis, clone H
7.20
6.00



7.00 
9.80
10.40 

9^ 

usr 

10.20 
12.00



41.20 
9.40 

13.91 



33.20 
ia20 

laoo 



9.80 



11.40 



35.40 
12.40
115652


BE0935B9 


Hs 38178 


115697 


D31382 


HS.6332S 


115793 


M424883 


H&70333 


115816 


BE04291S 


Ksi87S86 


11^92 


AA291377 


Hsi0831 


115906 


A1767756 

•Til Vf f ou 


Hs.82302 


115909 


AW872527 


Hs 59761 


11S965 


AA001732 


Hs.173233 


115978 


AU035864 


Hs.69517 


115985 


AA447709 


Hs.268115 


116090 


AI591147 


Hs!61232 


116096 


AA682382 


H8>59982 


116127 


AF126743 


Hs^9884 


116157 


BE439838 


Hs.44298 


116190 




Hs.67776 


116276 




Hs.47504 


116335 


AKDD11QQ 


Hs!41690 


116496 


AW45Qfi34 


H&21433 


116503 


Al92S31fi 


Hs.212617 


116674 


f\t f uuv 1 ^ 


Hs.92127 


116929 


AA586922 


Hs.60475 


116973 


AI7Q2054 


H5!l669B2 


116993 


AI417023 


Hs!40478 


117079 


H92325 




117317 


AI2B3517 


Hs.43322 


117326 


N23629 


Hsi41420 


117396 


W20128 


Hs^6039 


117412 


N32^6 


Hs.42645 


117519 


N325^ 


Hs!l46286 


117693 


AW1 70019 


H8112110 


117721 


N46100 


HsJ3939 


117881 


AF161470 


Hs260622 


117903 


AA7fiR9R3 


Hs.47111 


117992 




Hs 172089 


116013 


AI67d12fi 


Hs94Q31 


118017 


AI813444 


Hs.42197 


118186 


N22886 


H&42380 


118325 


AI868Q65 


W& 166184 


118367 


N642e9 


Hs.48946 


118368 


N64339 


Hs.48956 


118472 


AL157545 


Hs!42179 


118709 


AA2329ro 


H&293774 


119025 


RPflOTTfiO 

BKWVWf Vw 


HsJ5209 


119027 


AFD86161 


Hs.1 14611 


119052 


R10889 




1 IVtO'l 




H&46743 


119166 


AI979147 




119243 


T12603 




119490 






119499 


AIQIflOng 
r\lV lOvvv 




119899 


W455S2 




119780 




Hs.191381 


119845 


W79123 


HS.S8561 


119941 


AAG99485 


Hs.S88g6 


11^94 


AA642402 


Hs.59142 


120102 


W67353 


Hs.170218 


120104 


AK000123 


Hsil 80479 


120294 




Hs.153881 


1204B6 


AW368377 


Hs.1 37569 


120599 


AAS04446 


Hs.1 04463 


120699 


At683243 


Hfi.972S8 


120715 










tteQ(iR7fl 

rKit9QOf u 




AA826434 


Hi; iris 






UeOTniQ 






UeOTCOT 
. rK>.9rtlor 






n8a£f IOmI 


171191 


AA399371 


na. I09U99 




AAin971^ 


Ue Q7R79 


121369 


AW4fifl737 


Hfi.19R7Q1 

Dd. lAur 9 1 




AA448103 


Hs.1fl7QSB 


121476 


AA412311 


HS.97W3 


121509 


AA868939 


Hs 97888 


121553 


AA412488 




121753 






121838 


AA425680 


Hs.98441' 


121857 


BE387162 


Hs.2808S8 


121991 


AA43n58 


Hs.g8649 


122089 


AW016543 


Hs.98682 


122105 


AW241665 


Hs.98e99 


122163 


AA435702 


HSJ8829 


122318 


AA429743 




122335 


AA4432S8 


HS241551 


122338 


AA443311 


Hs.98998 


122414 


AI313473 


HSJ99087 



hypothetical protein FLJ23468 asi
transmembrane protease, serine 4 62.14
hypothetical protein MGC10753
Homo sapiens cDNA FLJ13575 fis, clone PL
ESTs 27.40
Homo sapiens cDNA FLJ14814 fis, clone NT 2.53
ESTs, Weakly similar to DAP1_HUMAN DEATH 11.82
hypothetical protein FLJ10970
cDNA for differentially expressed CO16 g
ESTs, Weakly similar to T06599 probable 3.00
ESTs R17
ESTs 8.20
DNAJ domain-containing 10iO
mitochondrial ribosomal protein S17
ESTs, Weakly similar to T22341 hypotheti
exonuclease 1 9.S0
desmocollin 3 3.67
hypothetical protein DKFZp547J036 7.00
ESTs
ESTs 32.00
polymerase (RNA) II (DNA directed) polyp 7.60
phosphatidylinositol glycan, class F aso
ESTs
gb:ys85f05.s1 Soares retina N2b4HR Homo
ESTs
Homo sapiens mRNA for KIAA1756 protein,
ESTs
ESTs
kinesin family member 13A
mitochondrial ribosomal protein
EST
butyrate-induced transcript 1 2.71
ESTs
Homo sapiens mRNA; cDNA DKFZp586l2022 (f
ESTs
ESTs 8.82
ESTs 7.00
unsrraciinz
EST 6.14
gap junction protein, beta 6 (connexin 3 3.14
bromodomain and PHD finger containing, 3 12.40
ESTs
Homo sapiens mRNA; cDNA DKFZp434K0514 (f 4.50
hypothetical protein FLJ11608 3.22
gb.7f38d02.s1 Soares fetal liver spleen 9.60
McKusick-Kaufman syndrome 6.60
hypothetical protein FLJ22593
gb:CHR90123 Chromosome 9 exon II Homo sa
ESTs, Moderately similar to B34087 hypot
ESTs 14i0
gb:zc26d03.s1 Soares_senescent_fibroblas 12.60
hypothetical protein 17.00
G protein-coupled receptor 87 1 3Ji0
ESTs
ESTs 7.73
KIAA0251 protein
BE513171 Hs.79086
134107 NM_005629 Hs.187958
134112 AW449809 Hs.79150
134158 U15174 Hs.79428
134160 T98152 Hs.79432
134168 AA398906 Hs.181634
134185 AA285136 Hs.301914
134201 L35035 Hs.79886
134272 X76040 H&27e614
134276 BH783936 Hs.80976
134353 AI138201 Hs.82120
134367 AA339449 Hs^2285
134380 AU077143 Hs.179565
134423 H53497 Hs.83006
134469 AA279661 Hs.83753
134470 X54942 Hs.83758
134498 AW246273 Hs.84131
134502 BE148534 Hs^168
134510 NM_002757 Hs.250870
134548 N95406 Hs.333485
134654 AK001741 Hs.8739



a60 

27.40 



6.60 



topoisomerase (DNA) II alpha (170kD) 19.00
ESTs a48
ESTs a40
replication factor C (activator 1) 4 (37 56.00
ART-4 protein
ESTs a03
melanoma antigen, family A, 3 9.60
Homo sapiens cDNA: FLJ22373 fis, clone H a30
cysteine knot superfamily 1, BMP antagon 21.00
lymphoid-restricted membrane protein 840
desmoglein 2
procollagen-lysine, 2-oxoglutarate 5-dio 2.70
fibroblast activation protein, alpha 2.71
Homo sapiens clone TCCC(A00427 mRNA sequ 3.83
ecotropic viral integration site 2A
hypothetical protein DKFZp434K243$ 9.50
hypothetical protein FLJ10683 4.60
DnaJ (Hsp40) homolog, subfamily B, membe
DKFZP434F091 protein
ESTs, Moderately similar to ALU8_HUMAN A
SMC4 (structural maintenance of chromoso
protein regulator of cytokinesis 1 4.38
H2A histone family, member P 7.00
bone morphogenetic protein 7 (osteogenic 2.64
thiopurine S-methyltransferase
hypothetical protein FLJ20624
carbonic anhydrase XII 4.95
DNA segment on chromosome X (unique) 992 8.20
laminin, gamma 2 (nicein (100kD), kalini 4.38
guanine nucleotide binding protein (G pr
serine (or cysteine) proteinase inhibito 4.60
phosphoserine phosphatase 3.71
SAC2 (suppressor of actin mutations 2,
eukaryotic translation initiation factor
tensin
geminin 3.09
ESTs, Weakly similar to YAE6_YEAST HYPOT
CGMOprotein asO
transcription factor AP-2 alpha (activat 6i18
clone HQ0310 PRO0310p1 3.19
p21/Cdc42/Rac1-activated kinase 1 (yeast 2.96
propionyl Coenzyme A carboxylase, beta p 2.55
chaperonin containing TCP1, subunit 2 (b
high-mobility group (nonhistone chromoso
RNA binding motif protein 8A
cerebellin 1 precursor
twist (Drosophila) homolog (acrocephalos 3.(K)
enolase 1, (alpha)
guanine nucleotide binding protein (G pr 12J0
claudin 1 2.85
ubiquinol-cytochrome c reductase hinge p
protein tyrosine phosphatase, non-recept 6.80
desmoplakin (DPI, DPII) &14
hypothetical protein MGC14353
ELAV (embryonic lethal abnormal vision,
glycyl-tRNA synthetase
acid phosphatase 1, soluble
splicing factor, arginine/serine-rich 5
solute carrier family 20 (phosphate tran 6.11
ADP-ribosyltransferase (NAD+; poly (ADP-
discs, large (Drosophila) homolog 5 3.07
NIPSNAP, C. elegans, homolog 1
ESTs, Weakly similar to similar to ankyr
phosphoglycerate kinase 1
mitochondrial ribosomal protein L3 2.56
solute carrier family 6 (neurotransmitte 8.20
chaperonin containing TCP1, subunit 4 (d
BCL2/adenovirus E1B 19kD-interacting pro 314)0
fibrillin 2 (congenital contractural ara 24.60
Homo sapiens cDNA: FLJ23602 fis, clone L
neuronal specific transcription factor D
ribose 5-phosphate isomerase A (ribose 5 a40
protease, serine, 15 4.50
antigen identified by monoclonal antibod aOO
nuclear receptor subfamily 4, group A, m
phosphoribosylglycinamide formyltransfer 2.80
minichromosome maintenance deficient (S. 4.68
CGI-139 protein
small nuclear ribonucleoprotein polypept
CDC28 protein kinase 2
threonyl-tRNA synthetase
UV-B repressed sequence, HUR 7 13£0
mitogen-activated protein kinase kinase
Deleted in split-hand/split-foot 1 regio
hypothetical protein FLJ10879 &00
3.82



1Z25 



13.20 



9.20 
19.60 



15.83 



9.48 
1Z00 



10.60 



17J0 
14.00 



4.36 



5.83 



3J7 



4.00 
8.96 
4.28 

4.63 

4.66 

4.55 

4.85 
6.34 

4.91 
460 
3.85 

4.08 

6.71 



14.74 



18.40 



a70 



3.84 
&81 
4.21 
7.30 



4.63 
134724 AR)4S239 Hs.321576 ring finger protein 22
134743 AA044163 Hs.89463 potassium large conductance calcium-acti
134781 AA374372 Hs.89626 parathyroid hormone-like hormone
134805 AD001528 Hs.89716 spermine synthase
134853 BE268326 Hs.90280 5-aminoimidazole-4-carboxamide ribonucle
134859 D26488 Hs.90315 KIAA0007 protein
134891 R51083 Hs.90787 ESTs
134960 BE246400 Hs.285176 acetyl-Coenzyme A transporter
134993 BE409809 Hs.301005 purine-rich element binding protein B
135047 AL134197 Hs.93597 cyclin-dependent kinase 5, regulatory su
135080 AI761180 Hs.94211 rcd1 (required for cell differentiation,
135103 NM_003428 Hs.9450 zinc finger protein 84 (HPF2)
135145 AW014729 Hs.95262 nuclear factor related to kappa B bindin
135184 U13222 Hs.96028 forkhead box D1
135242 AI583187 Hs.9700 cyclin E1
135286 AW023482 Hs.97849 ESTs
135289 AW372569 Hs.9788 hypothetical protein MGC10924 similar to
135355 AK001652 Hs.99423 ATP-dependent RNA helicase
135371 NM_006025 HsJ97 protease, serine, 22
135393 L11244 Hs.99886 complement component 4-binding protein,



4.00 



4.00 

9.50 
5.00 



13.50 
6.46 

10.00 

aoo 



1Z00 



25.20 



6.20 
7.40 
4^ 
4.79 



11.00 



7.00 



4.48 



4.01 



aeo 



14.60 



TABLE 5B shows the accession numbers for those primekeys lacking unigeneID's for Table 5A. For each probeset we have listed the gene cluster number from 
which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey 

117079 
124305 
101502 
109792 
126034 
102766 
126345 
127066 
127D99 
119243 
125875 
112054 
126979 
126992 
122318 
114699 
114793 
108305 
108393 
100867 
123731 
109700 
120715 
113702 
115113 
101045 
108554 
108573 
119052 
126522 
126805 
103768 



CAT number Accessions 



1621717J 
242183 1 
18202_^ 
754958J 
1598157J 
44641J 
1653833.1 
170345B 1 
244301_1 
1774795J 
1566433.1 
1538292.1 
171411.1 
880655.1 
292419.1 
135322.1 
15074?.! 
111550 J 
113411 1 
tigr_HT4586 
genbankJ\A609889 
genbank^F09609 F09609 
genbanlLAA292700 
genbanJLT97307 T97307 
genbanlLM256460 
entrez^J05614 J05614 
genb8n)(uAA084948 
genbanleAA086005 
149538 1 
416020.1 
439280.1 



K92325T9712S 

AW963221 AA344870 AA344871 H93331 



R49625F10874 
H60340 N91637 
U82321 H66077 
N49713N49819W03810 
R25066 R20144 R20146Z43845 
AA347668 AWg56810 Z44271 R)7065 FD7064 R13S06 
T12603T12604 
H14480N98295 
R43590F10439 
AA210954AA211007 
A]809521 H12174Z42556 
AA429743AA442754 
AA127386R15644AA127404 
AA158245AA158235 
AA071391 AA069892AAQ69891 
AA075211 AA075245 AA075126AA074946 
U14622 

AA609839 



AA292700 



AA2564eO 



AA084948 
AA086005 
R108B9 R10888 
W31912AI167491 

AA676910 AA778853 AA778865 W86800 
46922.1 W42667AI580740AI690440AI561350AW467906AW151450AI825927AiXM1716AI885600AI742213AVV24M^^ 

AA845593 Ai62371 1 N685B3 000064 AA193557 AW083868 AW1 6321 6 AAigiS95 AA522778 A1628008 AI915518 AA843508 AI926196 
AA176265 AW167963 AA992115 W93647 AW103572 A1862994 AI342059AA911719 AA176155AA024712AAD69988AA2D5591 AI591107 
A1199673A]811766AI275B32AI422233AI191B52A1096682AI580124Ai683612AA5B2453AAS275S9AA488415T3241^ 
H44848 H20477 T91695 W47039 AA0700S5 AA024795 AA32885S AA379248 AA379330 AA385580 W2S920 W036B8 AA448359 AA093a81 
AW362477 AA089997 AI350265 W93479 N9g888 AA932257 AW351459 H68590 AA663402 AA0B9771 AW087986 AI858420 AA600214 
AI970774AI857712AI683081Ai885584AW1311MAI56798lAVV002714AW189973AW075495AW168303AA9«^^ 
Ai566663 AW51 2676 AI570580 AI023690 AA44821 6 AI079853 A!422707 AA77951 6 AWQ26972 AW130Q82 AW162307 AW438646 AA709332 
AW192394 AI167350 AI217879 Ai1291 52 AA7ig509 A!350480 AA66341 8 AI003634 AW1 18546 AA180261 AA442a33 AI28^ 
Ai038759 AA846723 AI248770 AA993694 Ai280335 AI8a5107 AW518649 AA641563 AA995835 AA582521 A1276744 AA436478 AI017360 
Ai620763 A18S9887 N73926 AI076327 AI741615 Ai160617 AW172819 AI492005 AA677429 AA996334 A1693771 AI950039 A1245629 A1288515 
AI866186 T93293 AA173262 AA599779 AI680092 AW439316 AI084555 AI272672 AI583507 AW47321 9 AA738132 AW473283 A1367492 
AA995410 AI689fi24 AA206353 AI033095 A1040382 AA873630 AI221074 AI934840 AI41fi680 AA844306 R94503 AA773520 AA843169 
AA219425AA629658A1811719 AW411275AI590981 W37907AI591178AI684051 AA983238AA669347AA976239 AA704570AI628339 
AI884391A1241580AI003539AW176687AA009650 N34566AI333493Ai186070AA070827AA411683AI280884AA8^^^ 
AAfi21576 N71953 AI885888 AW076039 T15777 A1537673 AW248048 H09554 W93480 W47001 AW0791 14 AA0631 60 AA757453 R60788 
AI859431 H20478 AA2188B2 AA757465 AA100995 Ai864135 A1934209 AA070503 H47008 AA219846 W61039 W93907 AW3B5050 W37967 
W78028 AA189007 AA4791 36 R93650 AA442312 T30287 AAW7628 AA180262 AA00g649 C03B92 AW149464 AA310963 AA219693 
AA069747 R^207 AAO94784AA293615AA447848AI9B4167N90393 C05097 N56499 AW292351 AW149681 AW473258AA629322AI004409 
AW105577 AI954937 AI811070 AA902422 AW514437 AA535460 AA916877 AW517122 AA974657 AA975649 AW517130 AW517129 F31737 
W0768BAA193645AA37B994AA488273F32267W39303AAD21181 N86810AA406524AA062553AA436B01 HOB98SH1S979H40310 
AA436789 AA232172 AW360778 W258e2 R60282 AA436530 AA378894 Ml 87461 AI940535 AA6tM2f0AA0BS514 AA36tM21 NBB243 NB4281 
AA209340 N56174 N68374AA191088 AW247691 AA24g013AA093111 AA972536 AW2S8594AA3758g3T12139 W2B166 AW243849 
AI288629 AA843996 W15260 AI188286 AW248079 R15836 

119599 genbank_W45552 W45552 
112382 genbank_R59904 R59904 
105264 genbank_AA227934 AA227934 
100071 enlre%>28102 A2B1IS 
123315 714071_1 AA498369 AA498646 

Table 6A shows 99 genes up-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: average of AI for samples from non-smokers with adenocarcinoma divided by the 90th percentile of AI for samples from smokers with adenocarcinoma 

R2: average of AI for samples from non-smokers with squamous cell carcinoma divided by the 90th percentile of AI for samples from smokers with squamous cell 
carcinoma 
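
The R1 and R2 columns of Table 6A are ratios of a group average to a group percentile of AI. The following is a minimal sketch of that arithmetic, not the procedure actually used to build the table. The function names, the sample AI values, and the choice of linear interpolation for the percentile are all assumptions made for illustration; the text does not state how the 90th percentile was computed.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Assumption: linear interpolation between closest ranks; the source
    does not say which percentile convention was used.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def expression_ratio(numerator_ai, denominator_ai, pct=90):
    """Average AI of one group divided by the pct-th percentile AI of another."""
    return mean(numerator_ai) / percentile(denominator_ai, pct)


# Hypothetical AI values for a single probeset (not taken from the tables).
non_smokers = [120.0, 150.0, 180.0]
smokers = [10.0, 12.0, 14.0, 16.0, 20.0]

r1 = expression_ratio(non_smokers, smokers)
print(f"R1 = {r1:.2f}")
```

Dividing by a high percentile of the comparison group rather than by its average makes the ratio conservative: a probeset scores well only when the typical non-smoker sample exceeds nearly all smoker samples.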



Pkey 


ExAccn 


UnigeneID 


Unigene Title 


100971 


BE379727 


Ks.63213 


fsiKjf add binding protein 4, adipocyte 


101174 


L17330 


H&280 


pra-T/NK cell associated protein 


101296 


Y12490 


HS.8S082 


thyroid hormone teceptor inleractor 1 1 


101304 


AA001021 


HSi66K 


V^raU honnone receptor interactor 8 


101806 


AAS86694 


Hs.112408 


S100 calcium4)in(fing protein A7 (psoiias 


101972 


S82472 




gb:bfita -poK)NA polymerase beta {exon a 


102274 


U30930 


Hs.156540 


UDP giyocsyltransferase 8 (UOP-galactDse 


102394 


NiyL003616 H&2442 


sdisintBg'^ and matdloprotdnasedoma 


102832 


U92015 




gbiHuman dona 143789 deisctive mariner 


103010 


Y52509 


Hs 161640 


tyrosine anrinotransferase 


103439 


X98^ 




gbiH^apiens mRNA for Ggase Ifle protel 


103563 


L02911 


Hs 150402 


acthrin A receptor, type 1 ^ 


103857 


nnifOf 99 


H5.45033 


laciimal praline ridi protein 


104239 




Hs^13ffl 


douUecorttn and CaM Mnasfr^ 1 


104590 


AVV373062 


Hs.63823 


nuclear receptor subfamily 1, group 1. m 


104907 




rid- isviui 


EST8. Weakly similar to AIU1.HUMAN ALU 


106131 


BE514768 


Hs298244 


SNARE protein 


106672 


H47233 


K5.30643 


ESTa 


106872 


igooor 


Hs 18282 


KIAA1134 protein 


106960 


AA156238 


Hs.32501 


ESTs 


106971 


Z43846 


Hs.194478 


Homo sapiens mRNA; cDNA DKFZp43401572 {f 


107982 


AA035375 


H5OT87 


ESTs, Weakly similar to KIAA0756 protel 


108562 


AA100796 




gb3in26c06Al Siratagone pancreas (93720 


108599 


AB018549 


Hs.69328 




108863 


BE219231 


H&292653 


ESTs. Weakly 8MlartoT28B45hypothBQ 


109247 


AA314907 


Hs.85950 


ESTs 


109630 


R44607 


Hs.22672 


ESTs 


110193 


AI004874 


Hs.310764 


Homo sa^iiens mRNA: cDNA Dia7p434Ml»2 (fr 


110234 


H24458 


H&3208S 


EST 


110644 


R94207 


Hs.268389 


ESTs. Highly sbitOartotype II CAIitiAFI 


110886 


AW274992 


Hs.72249 


throe-PDZoontainins praAein simnarto 


111057 


T79639 


Hs.14629 


ESTs 


111950 


AF071594 


Hs.110457 


Wbt^HiTBcMtom syndrome canddals 1 


112291 


R53972 


Hs.2602e 


ESTs 


112956 


Z43784 


Hs.76893 


ankyrin 3, node of Ranvier (ankyrin 6) 


113009 


T23699 


Hs.7246 


ESTs 


113060 


BE564162 


Hs^0820 


hypotheses] fovkean FIJ14827 


113073 


N39342 


Hs.103042 


microtubule-associated protein IB 


113074 


AK001335 


Hs.31137 


prolan ^^rosine phosptiatase. receptor t 


113121 


T48011 


Hs.8764 


EST 


113125 


AA968672 


Hs^929 


hypothetical protein FU11362 


113757 


AA703095 


HS.1B631 


ESTs 


11^ 


V\52854 


H&27099 


tVpothettoal protein FLJ23293 skniiar to 


113884 


AI333076 


HS.2B529 


chromosome 12 open reading frame 2 


113936 


W17056 


Hs.83623 


nuclear rec^itorsubtonoly 1, group 1. m 


114875 


AA23560g 


H5.236443 


Homo sapiens mRNA; cDNA DKFZp564N1063 ( 


114987 


AA251016 


KS.878Q8 


EST 


115460 


AW9S8439 


Hs.38613 


ESTs 


115722 


W91692 


Hs.69609 


CSTs 


116261 


AM81788 


Hs.190150 


ESTs 


116830 


H61037 


Hs.70404 


ESTs. Weakly similartoAUKLHUMAN ALU 


116970 


AB023179 


Hs.9059 


KIAA0962 protein 


117178 


H98675 


HsJ269034 


ESTs 


117757 


AFD88019 


Hs.46732 


EST 


118283 


AA287747 


Hs.173012 


ESTs. Weakly slmllartoA46010X4lnked 


118384 


AF217525 


Hs.49002 


Down syndrome ceD adheston molecule 


118657 


AI822106 


Hs.49902 


ESTs 


120328 


AAS23278 


H&290905 


ESTa. WeaMy similar to protease [lisapi 


120404 


AB023230 


Ks.86427 


IOAA1013protaln 


120524 


AA2618S2 


H&1929)5 


ESTs 


120688 


AW207S5S 


Hs.97093 


Homo sapiens cONA: aj23004 lis. donel 



R1 

1&00 



7.50 
7.50 
13^ 
9.50 

9.00 

U50 

16.50 

7.00 
11.50 

9.50 

16.50 
13.00 

7.00 

12.50 
16.50 
6.00 
17.00 
16.50 
11.00 



9.79 
32.50 



19.50 
&00 



9.50 

aso 

7.50 

7^ 
16i0 



7.00 
&00 
17.92 



R2 

a64 

2.46 
12.00 
Z68 
Zll 



2.80 

3.94 

12.66 

2.17 

Z38 
195 

Z40 
5.00 



3.00 
Z79 
450 



a82 
Z21 

Z65 

&00 
4.63 
7.00 
6.00 
Z27 
9.00 



Z68 



Z50 
Z39 
3.50 
121558 


AM12497 




gb2tg59l2.$1 Soar6S_tesiis_NHT Honio sap 


121676 


H56037 


H8.108146 


ESTs 


121^ 


AI024600 


Hs.98612 


ESTs 


121938 


AA4286S9 


Hs.98610 


ESTs 






Hs 98833 


EST 




AA2996S2 


H&11149& 


Homo sapiens cDNA FU1 1643 lis. done HE 


123551 






gbranShlZ^I SoaresJesti8_NHT Homo sap 


123756 


AAfi09971 


Hs 112795 


EST 


123861 


AA620B40 




gb:afB9g0U1 SoarB8_tB869LNHT Homo sap 


124371 


N24924 


Hs. 188601 


ESTs 








ESTs 


127591 




131092 


ESTs 


128252 


AA455924 


Hs.192228 


ESTs 


128426 


/U265784 


H&145197 


ESTs 


128925 


R67419 


Hs.21851 


Homo sapiens cDNA aJ12900 fe, done NT 


128945 


A1990506 


Hs.8077 


Homo sapiens mRNA; d3NA DKFZp547E184 (fr 


129105 


AI769160 


Hs.108681 


Homo sapiens brain iumor assodaled pfot 


129235 


pmnm 


Hs.126084 


KIAA1055 protein 


129506 


AB020664 


Hs.11217 


KIAA0877 protein 


i9Qi:Qc 


U09550 


Hs.1154 


oviductal glycoprotein 1, 120t(D (mucin 9 


lOUIDU 


AA305688 


Hs.267695 


UDP-Gal:b6taGlcNAc beta 1.3-gatactosyhr 


130340 


082326 


Hs^39106 


solute canter family 3 (cysfine. dbasi 


131220 


AB023194 


Hs,300855 


KIAAD977 protein 


131430 


A!B79148 


Hs.26770 


fatly add binding protein 7, brain 




NM_006152 Hs.40202 


iymphotdH^tricted membrane protein 


132458 


AA935315 


H5.48g6S 


Homo sapiens cONA: FU21693 f s. done C 


132647 


NM 006927 Hs.54432 


stsdyttransfiBfase 4B (betagaladosidase 




D49372 


Hs.54460 


smal indudble cytddne subfamily A ((^ 


132682 


A1077500 


Hs.54900 


seidogically defined cdon canoer anfig 


132747 


AA345241 


Hs^5950 


ESTs. Wealdy simOarto KIAA1330 protein 


132812 


H50333 


Hs.92186 


Leman coDed-cdl protein 


133337 


AF085983 


Hs.293676 


ESTs 


133876 


AL134906 


Hs.771 


pliosphorytase, glycogen; liver (Hers dis 


134119 


AW157837 


Hs.79226 


fasdculation and elongation protein zet 


1344S4 


AA302983 


Hs.239720 


CCR4-N0T transcription compiex. subunit 


134542 


M141S6 


Hs.85112 


insutin-Uke growth factor 1 (somatomedi 


135002 


AA448542 


Hs.251677 


G antigen 7B 


135305 


AA203555 


HS.982B8 


Homo sGtpiens cDNA FU14903 fis, done PL 
Z95 



10.00 

15.00 

14,00 

8.93 

13.04 

liJO 

\m 

6i0 



7.00 



10.00 
15^ 

6.50 

20.00 
11.50 
17.50 
&10 



7.S0 



Z50 

4.33 
3.02 

Z08 
Z11 



4.25 
10.00 



6.15 
5.56 

2.53 
2.50 
2.63 
3.82 
5.C0 
3.00 
2.06 
Z27 
11.50 

6i0 



TABLE 6B shows the accession numbers for those primekeys lacking unigeneID's for Table 6A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT number Accessions 

108562 36375_1 AA100796 AF020589 AA074629 AA075948 AA100849 AA085347 AA126309 AA079311 AA079323 AA0852^ 
103439 35330_1 X98266 1^1124 
123551 genbank_AA608837 AA608837 
123861 genbank_AA620840 AA620840 
102832 entrez_U92015 U92015 
101972 entrez_S82472 S82472 
121558 genbank_AA412497 AA412497 
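
Each row of a "B" table pairs a Pkey with a CAT number (gene cluster identifier) and the Genbank accessions in that cluster. The following is a minimal sketch, under stated assumptions, of how such rows could be read into a lookup keyed by Pkey. It assumes whitespace-separated fields and a cluster identifier with no internal spaces (for example written with an underscore); `parse_cluster_row` and `build_index` are hypothetical helper names, and the two sample rows are an illustrative transcription rather than verified data.

```python
def parse_cluster_row(line):
    """Split one 'Pkey CAT accessions...' row into a small record."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    pkey, cat_number, *accessions = fields
    return {"pkey": int(pkey), "cat": cat_number, "accessions": accessions}


def build_index(lines):
    """Map each Pkey to its parsed row."""
    return {row["pkey"]: row for row in map(parse_cluster_row, lines)}


# Illustrative rows; the underscore in the cluster identifier is an assumed reading.
rows = [
    "123551 genbank_AA608837 AA608837",
    "102832 entrez_U92015 U92015",
]
index = build_index(rows)
print(index[102832]["accessions"])
```

Keying on Pkey mirrors how the "A" and "B" tables relate: a probeset with no UnigeneID in the "A" table can be looked up here to recover the accessions behind its cluster.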
WO 02/086443 PCT/US02/12476 
Table 7A shows 98 genes down-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59680 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: 90th percentile of AI for samples from smokers with adenocarcinoma divided by the average of AI for samples from non-smokers with adenocarcinoma 

R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma divided by the average of AI for samples from non-smokers with squamous cell 
carcinoma. 



Pkey | ExAccn | UnigeneID | Unigene Title | R1 | R2 
100187 | D17793 | Hs.78183 | aldo-keto reductase family 1, member C3 | | 164.10 
100380 | D62343 | Hs.16551 | neuroblastoma (nerve tissue) protein | | 77>10 
100576 | X00356 | Hs.37056 | calcitonin/calcitonin-related polypeptid | 102.40 | 
100971 | BE379727 | Hs.63213 | fatty acid binding protein 4, adipocyte | 463.80 | 
101046 | K01160 | | (NONE) | 672.00 | 
101066 | AW970254 | Hs.889 | Charcot-Leyden crystal protein | 66iX) | 
101175 | U82671 | Hs.30980 | melanoma antigen, family A, 2 | | 77J0 
101497 | W05150 | Hs^7034 | homeo box A5 | 62.80 | 
101663 | NM_003528 | Hs^178 | H2B histone family, member Q | 78.00 | 
101677 | NM_000715 | Hs.1012 | complement component 4-binding protein, | 186^ | 
101745 | M88700 | Hs.150403 | dopa decarboxylase (aromatic L-amino aci | 804)8 | 
101941 | S77583 | | gb:HERVK10/HUMMTV reverse transcriptase | 99.20 | 
102125 | NM_00645o | Hs.288215 | sialyltransferase | | 103.10 
102242 | U27185 | Hs.82547 | retinoic acid receptor responder (tazaro | 67.00 | 
102340 | U37055 | Hs.278657 | macrophage stimulating 1 (hepatocyte gro | 71.60 | 
102369 | U39840 | Hs.299867 | hepatocyte nuclear factor 3, alpha | | 69.70 
102457 | NM_001394 | Hs.2359 | dual specificity phosphatase 4 | 153^ | 
102659 | U71207 | Hs^279 | eyes absent (Drosophila) homolog 2 | | 65.70 
102796 | AL079846 | Hs.107019 | symplekin; Huntingtin interacting protei | | 58.80 
102829 | NM_008183 | Hs.80962 | neurotensin | | 268.80 
103207 | X72790 | | gb:Human endogenous retrovirus mRNA for | 70.00 | 
103242 | X76342 | Hs^ | alcohol dehydrogenase 7 (class IV), mu o | | 212.10 
103260 | X78416 | H5^155 | casein, alpha | | 130.70 
103351 | X89211 | | gb:H.sapiens DNA for endogenous retrovir | 64J0 | 
104212 | AB002298 | Hs.173035 | KIAA0300 protein | 66.80 | 
104252 | AF002246 | Hs.210663 | cell adhesion molecule with homology to | 63.80 | 
104258 | AF007216 | Hs.5462 | solute carrier family 4, sodium bicarbon | 94.40 | 
105024 | AA126311 | Hs.9879 | ESTs | 66.20 | 
106260 | AI097144 | K5^25D | ESTs, Weakly similar to AliJLHUIIMN ALU S | | 74.60 
106440 | AA449563 | Hs.151393 | glutamate-cysteine ligase, catalytic sub | | 71.10 
106566 | BE298210 | | gb:601118016F1 NIH_MGC_17 Homo sapiens c | 73.20 | 
106605 | AW772298 | Hs^l103 | Homo sapiens mRNA; cDNA DKFZp564B076 (fr | 83.80 | 
105614 | AA648459 | Hs.335951 | hypothetical protein AI^Ol 222 | | 62.30 
106654 | AW075485 | Hs.286049 | phosphoserine aminotransferase | | 202.40 
106999 | H93281 | Hs.10710 | hypothetical protein FLJ20417 | | 89.60 
108700 | AA121518 | Hs.193540 | ESTs, Moderately similar to 2109260A B c | | 66.40 
108810 | AW295647 | Hs.71331 | hypothetical protein MGC5350 | | 95.50 
108857 | AK001468 | Hs.62180 | anillin (Drosophila Scraps homolog), act | | 63.40 
109597 | AA989a62 | Hs.293780 | ESTs | 85.00 | 
109691 | T65568 | Hs.12860 | ESTs | | 58.70 
109704 | AI743860 | Hs.12876 | ESTs | | 60.60 
110942 | 1^63503 | Hs.28419 | ESTs | 76.40 | 
111722 | R23924 | Hs^3596 | EST | 74.60 | 
112891 | 7^)3927 | Hs.293147 | ESTs, Moderately similar to A46010 X-S | 64.80 | 
112992 | AL157425 | Hs.133315 | Homo sapiens mRNA; cDNA DKFZp761J1324 (f | | 76.70 
113073 | M39342 | Hs.103042 | microtubule-associated protein 1B | | 120.20 
114251 | H15261 | Hs.21948 | ESTs | 127.20 | 
115230 | AA278300 | Hs.124292 | Homo sapiens cDNA: FLJ23123 fis, clone L | 174.00 | 
115291 | BE545072 | Hs.122579 | hypothetical protein FLJ10461 | | 91.00 
116815 | AW505328 | Hs.180842 | ribosomal protein L13 | 66.40 | 
 | AUUB70K97 | nS<09f Dl | ESTs, Weakly similar to unri^nUMAn ucAin | | 226.60 
115965 | AA001732 | Hs.173233 | hypothetical protein FLJ10970 | 82.80 | 
116107 | AL133916 | Hs.172572 | hypothetical protein FLJ20093 | | 361.60 
116552 | D20508 | Hs.164649 | hypothetical protein DKFZp434H247 | 69.00 | 
116571 | D45652 | | gb:HUik46S02848 Human adult lung J dired | 64.20 | 
118468 | N66741 | | gb:yz33g0&«1 Morton Fetal Cochlea Homo | | 63.50 
120484 | AA253170 | Hs.96473 | EST | 81.60 | 
120983 | AA398209 | Hs.97587 | EST | | 81.10 
121034 | AL389951 | Hs.271623 | nucleoporin 50kD | | 66.20 
121423 | AW973352 | Hs.290585 | ESTs | 64.40 | 
122553 | AA451884 | Hs.190121 | ESTs | | 60.40 
122946 | AI7t8702 | Hs.308026 | major histocompatibility complex, class | 186.60 | 
123130 | AA487200 | | gb:ab19i02^1 Stratagene lung (937210) H | | 80.20 
124472 | N52517 | Hs.102670 | EST | 71.00 | 
124526 | N620g€ | Hs^185 | ESTs, Weakly similar to JC7a26 amino aci | | 104.90 
125489 | H49193 | Hs.124984 | ESTs, Moderately similar to AliJ7jnJMANA | | 72.00 
125731 | R61771 | Hs.26912 | ESTs | | 69.90 
125747 | NM_002884 | Hs.a65 | RAP1A, member of RAS oncogene family | 69.00 | 
126020 | H79863 | Hs.114243 | ESTs | | 62.40 
126547 | U47732 | Hs.84072 | transmembrane 4 superfamily member 3 | | g2,^n 
126866 | R38438 | Hs.182575 | solute carrier family 15 (H+/peptide tra | | 60.10 
127472 


AA761378 


Hs.192013 


ESTs 


127610 


AA960867 


>fe.1S0271 


ESTs, Highly similar b uimamed protein 


127742 


AW293496 


Hs.1 80138 


ESTs 


127987 


At022103 


Hs.1 24511 


ESTs 


128233 


AW889132 


Hs.11916 


ribokinase 


128420 


AA6S0274 


Hs.41296 


&ronec6n leucine rich transmembrane p 


128766 


AW1 60432 


Hs.296460 


cfaniofscisl dsvdopment protBin 1 


129014 


AW935187 


H8.170162 


KIAA1357 protein 


129215 


AB040930 


Hs.126085 


KIAA1497 protein 


130090 


H97878 


Hs.132390 


zinc finger protein 36 (KOX 1Q 


130385 


AW0876QQ 


Hs.155223 


stanniocalcin 2 


130732 


AW890487 


Hs.63984 


cedherin 13, H-cadherfn (tieari) 


131025 


AB0409Q0 


Ks.6189 


KIAA1467 protein 


131241 


BE501914 


Hs.24654 


Homo sapiens cDNA FU11640 &, done HE 


131775 


AB014548 


H&3ig21 


KiAA0548 protein 


132240 


AB018a24 


Hs.42676 


KIAA0781 protein 


132856 


NiyL001448 


H5.58367 


9lypic8n4 


132977 


AA093322 


Hs.301404 


RNAUrufingmolifprd^S 


133749 


L20852 


Hs.10018 


solute canier family 20 (ptiosphate Iran 


133818 


AI110684 


Hs.7645 


fibrinogen, B beta polypeptide 


134264 


AF149297 


H5.8087 


NA&5p^n 


134265 


M83772 


Hs.80878 


flavin corislning monooxyBenase 3 


134346 


XB4002 


Hs.82037 


TATA box binding protein (TBP)-as80datB 


134395 


AA456539 


Ks.8262 


lysosomal-assoclated membrane protein 2 


135047 


AL134197 


HS.93S97 


cyciin-dependent Idnase 5, regulatory 8U 


135056 


N75765 


Hs.93765 


lipoma HMGIC fusion partner 


135309 


AIS64123 


Hs.42500 


AOP-ribosylalionfoclDr^5 



70.20 
64.00 
85.20 



64.20 
63.80 



64.40 
7&20 
97.80 



133.20 
341.00 

66.00 



71.40 
70.40 



78.90 

loago 



58.53 



139.60 
64.60 



71.00 
88.40 

59.30 

64.30 
23Z53 

75.80 

iaa3o 



TABLE 7B shows the accession numbers for those primekeys lacking unigeneID's for Table 7A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey 



CAT number Accessions 



103207 30635_'4 
120358J 



X72790 



BE298210 AI672315 AVV086489 BE298417 AA455921 AA902537 BE327124 R14963 AA08521O AW274273M 
AI885095AI476470 AI2876S0A1885299 AI965381 AW592624 AW340136AI266556 AA45S390 AI310815 AA484951 
116571 genbank_D45652 D45652 
118466 genbank_N66741 N66741 
101046 entrez_K01160 K01160 
101941 entrez_S77583 S77583 
103351 entrez_X89211 X89211 
123130 genbank_AA487200 AA487200 
Table 8A shows 1720 genes either up or down-regulated in lung tumors or chronically diseased lung relative to a broad collection of over 40 distinct normal body tissues. 
Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 39494 
probesets on the Eos/Affymetrix Hu02 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a 
normalized value reflecting the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: 70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung 

R2: 70th percentile of AI for chronically diseased lung divided by 90th percentile of AI for normal lung 
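
The two ratios in this legend can be written compactly. The notation below is introduced here for illustration and does not appear in the source: $P_q(\cdot)$ denotes the q-th percentile of AI across the samples of a group, and the reference group is read as normal lung.

```latex
R1 = \frac{P_{70}\!\left(\mathrm{AI}_{\text{lung tumor}}\right)}{P_{90}\!\left(\mathrm{AI}_{\text{normal lung}}\right)},
\qquad
R2 = \frac{P_{70}\!\left(\mathrm{AI}_{\text{chronically diseased lung}}\right)}{P_{90}\!\left(\mathrm{AI}_{\text{normal lung}}\right)}
```

Under this reading, a ratio above 1 means that the 70th-percentile disease sample has a higher AI than the 90th-percentile reference sample, and a ratio below 1 means it has a lower AI.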



Pkey 


ExAccn 


UnigeneID 


Unigene Title 


R1 


R2 


300097 


AI916973 


Hs.213603 


ESTs 


5.46 


4.69 
056 


300117 


AW189787 


Hs.147474 


ESTs 


058 


300197 


AI686661 


Hs.218286 


ESTs 


426 


&44 


300201 


A1308300 




gb:ta90c06j(1 NQ CGAP Bm20 Homosapien 


062 


083 


300225 


AI989963 


Ks.197505 


ESTs 


1.68 


1.75 


300247 


AVV274682 


Hs. 161394 


ESTs 


1.08 


13B 


300256 


AI469095 


Hsi98241 


Transmembrane proteaset serine 3 


086 


1.00 


300337 


AI7Q7B81 


HsJI02090 


ESTs 


&80 


009 


300362 


242308 




gb:HSC0FB121 normalized inftet brain cON 


4.18 


1178 


300374 


At^9947 


Hs.314158 


ESTs 


ZS9 


438 


300387 


AW270150 


Hs.254516 


ESTs 


1^ 


153 


300440 


AI421541 


Ks.146164 


ESTs 


3L98 


5.2s 


300441 


R10367 


Hs.307921 


EST. VfeaMy simibr lo 223LHUMAN ZINC F 


ai8 


O80 


300449 


A1362967 


Hs.132221 


hypothetic^ protein FIJ12401 


043 


062 


300469 


AW1 35830 


Hs.233955 


hypothetic^ protdn FU20401 


016 


083 


300552 


XB5711 


HS.21B38 


hypothetical protein FU1 1 191 


4.10 


9.75 


300627 


VV27363 




gb:ab37d01x1 Stratagene HeLa cell s3 93 


4i0 


1160 




AW11S822 


Hs.1 28757 


ESTs 


191 


S.86 


300716 




Hs 126280 


hypotheScal pTOtsbi FLJ23393 


1.00 


092 


300738 


AI623332 


Hs.1 30541 


KIAA1542 protein 


m 


1.71 


300777 


AA235361 


Hs^840 


K1AA1527 protein 


448 


022 


300790 


AI492471 


Hs 188270 


ESTs 


1.29 


1.18 


300832 


rVviio 1 1 1 


Hs^2061S 


ESTSi Weaidy sindlar b TO384S transcrtp 


&fi1 


056 


300836 


244942 




caldum charuiel a1|rfia2-delt93 subunit 


4.80 


034 


300B38 


AI<iS9897 


Hs.1 92570 


hypotiieQcd piot^ FU22028 


1.70 


181 




AWd^5B02 


HS.2B5S01 


Homo sapiens cONA FLJ204^ Gs, done KA 


4.56 


7.91 


^nRQ7 




r», ■£/ uw^ 


ESTs, Weaidy simiiar to T17233 hypothefi 


2.23 


1.58 


0UU9£0 


AAfyTVUVin 
/VWW400U 




nh:ahQ3a1 0^1 Sfrataenfi fatal iBtbia 93 


Z13 


3.50 


30(360 




H£ 152454' 


ESTs 


2:74 


4.46 


0UU90I 






^Ts. UteaUv similar ta unnamed DfDldn 


1.00 


1.00 


OUU9UC 


AAS93373 


Hs.293744 


ESTs 


1.46 


1.51 


300967 




Hs.269439 


ESTs 


039 


1.30 


300987 


AW45Q840 


HS.14B590 


ESTs, Weddy simiiar to AF208846 1 BM^ 


1.49 
016 


1J» 


0UU9OO 


Al 927208 


Hs^0&952 


ESTs 


037 


vH)1050 




Hs.28fi516 


ESTs, weaidy similar to S6989D mitogen i 


3L23 


1.94 


^1(38 


AAS77S7n 


1^185918 


ESTs 


076 


14.28 


301157 


AA729905 


KS.231916 


ESTs 


ai6 


085 


301162 


AI142118 


Hs 129D04 


ESTs 


1.66 


7.18 


301170 


AA737594 


Hs.247606 


ESTs 


4.40 


042 


301192 


AIB0&7S1 


Hs.121183 


ESTs 


6.38 


11.59 


301193 


/v\f 90 1 1 0 


H^19R3S0 


ESTs, Weaidy similar to JCS423 2-liydro)ty 


4.35 


7.78 


301267 


AW2g7762 


Hs.2556% 


ESTs 


1.G6 


1.61 


301281 




Hs 19QS86 


ESTs 


Z19 


1.78 


301341 


yUB191S8 


Hs 208229 


ESTs 


076 


076 


301382 


AA912B39 


Hs.1 63369 


ESTs 




1.81 


301407 




Hs.126830 


ESTs 


1.48 


1.51 


301452 




Hs 159955 


ESTs 


051 


1.46 








Ifaififted 

lillUuDU 


2.40 


SX)2 


301494 




Hs.1 310% 


ESTs 


2.79 


141 


301521 


AI733621 


Hs.1 33011 


zinc finger protein 117 (HPF9) 


067 


067 


301531 


AI077462 


Hs.1 34084 


ESTs 


152 


076 


301560 


AI878959 


HsJ3737 


splicing factor, arginine/seiina-rich 1 


7,41 


11.92 


301676 


Z43570 


Hs.27453 


ESTs, Moderately 8imilartoG01251 Rarp 


031 


10.70 


301690 


R)5865 


Hs.ia8323 


ubiquitiii-conjugaSng enzyme E2E 2 (homo 


Z70 


4.22 


301718 


F07744 


H5.7987 


DKFZP434F162 protein 


4i0 


078 


301799 


AA3B4252 


H5.286132 


D15F37{pS8udogene) 


5.93 


7.04 


301804 


AA581004 


H5.62160 


anlllin (Drosophlla Scraps homolog). act 


1.70 


076 


301822 


X17033 


Hs.271986 


Integrin, alpha 2 (C049B, alpha 2 subuni 


1.58 


1.36 


301846 


R20002 


Hs.6823 


hypotheticai protein FU10430 


1X0 


1.00 


301868 


m608 


Hs.13861 


ESTs. Weekly similar to pH sensitive max 


2.8B 


049 
3.80 


301862 


T7B054 




gb:yc97g09.r1 Soares IrdiBnt brain INIB H 


2.28 


301905 


Ai991127 


Hs.1 17202 


ESTs 


IXK) 


1.00 


301948 


AA344647 


Hs.116724 


aldo^ceto reductase family 1, member B11 


5.2B 


128 


301960 


AW070252 


Hs.27973 


K1AA0874 protein 


5.3B 


048 


302011 


T91418 


Hs.125156 


transcriptional adaptor 2 (ADA2, yeast 




042 


302016 


N40834 


H5.23495 


hypothetical protein FU11252 


m 


1.25 


302041 


NW_C01501 


Hs.129715 


gonadotropin^leasing hormone 2 


0.71 


099 
1.71 


302072 


AJ23e381 


Hs.132576 


pared box gene 9 


1.60 


302094 


AI286176 


Hs.6786 


ESTs 


062 


1.20 


302095 


AWD44300 


KS.137S06 


Homo sepiens BAC ctone RP1 1-120)2 fhim 7 


175 


493 


302148 


AW269618 


H8.23244 


ESTs 


ao4 


087 



124 



H$.159003 
Hs.159140 
Hs.41143 
Hs.159297 
Hs.166361 
Hs.175563 
Hs.23240 
H8.194625 

Hs^2676 
Hs.211956 
H8^18028 



Hs.227277 

Hs^473 

Hs.240770 

Hs.6335 

Hs^41578 

H5^0424 

Hs.187032 

Hs.48956 

HS248572 

Hs^72100 

Hs.173560 

Hs.102696 

H5.198273 

Hs.146274 
303131 AW081061 Hs.103180 
303195 AA082211 Hs.233936 
303196 AA082298 Hs.59710 
303216 AA581439 Hs.152328 
303222 AA333538 Hs.204501 
303234 AA132255 Hs.143951 
303251 AW340037 Hs.115897 
303295 AA205625 Hs.208067 
303287 T80072 
303316 AF033122 
303467 AA398801 
303506 AA340605 
303552 AA359799 Hs.224662 
303598 AA382814 
303637 AF056083 
303655 AA504702 Hs.258802 
303756 AI738488 Hs.115838 
303856 AA968569 Hs.160532 
303893 N88597 Hs.113503 
303907 AW467774 Hs.171880 
303946 AW474196 Hs.306637 
303978 AW513315 
303981 AW513804 Hs.278834 
303990 AW515465 
303998 AW516449 
303999 AW516611 
304005 AW517947 
Hs.132127 
Hs.130683 
Hsl78953 
Ks.134079 



Hs.13423 
Hs.14125 

Hs.323397 
HS.10S887 



Hs^4879 



ESTs 

transient receptor potential channel 6 
UDP-Gal:betaGlcNAc beta 1,4- galactosylt 
phosphoinositide-specific phospholipase 
killer cell lectin-like receptor subfami 
Homo sapiens mRNA; cDNA DKFZp564F112 (fr 
Homo sapiens mRNA; cDNA DKFZp564N0763 (f 
Homo sapiens cDNA FLJ13496 fis, clone PL 
dynein, cytoplasmic, light intermediate 
mucin 4, tracheobronchial 
synaptonemal complex protein 2 
CD3-epsilon-associated protein; antisens 
adaptor-related protein complex 4, epsil 
KIAA1054 protein 

Homo sapiens mRNA; cDNA DKFZp564J062 (fr 
sine oculis homeobox (Drosophila) homolo 
UDP-N-acetylglucosamine:a-1,3-D-mannosid 
nuclear cap binding protein subunit 2, 2 
SWI/SNF related, matrix associated, acti 
U6 snRNA-associated Sm-like protein LSmS 
Homo sapiens cDNA FLJ13540 fis, clone PL 
ESTs 

gap junction protein, beta 6 (connexin 3 
tiypothetica) protein FU22965 
SMS3 protein 

odd Oz/ten-m homoiog 2 prosophBa, mous 
MCT-1 protein 

NAOH detiydiogenase (ubiquinone) 1 beta s 
ESTs 

Homo sapiens, done IIMAGE2623731, mRNA, 
S164 prc^n 

gb:yu66g1 1.r1 Weizmann Olfactory EpBhel 
ESTs 

gb:Homo sapiens mRNA for hnmunoglobuiin 

gb:Hunian immunoglobuiin heavy chain, V-r 

gbiHuman autonomously reipficating sequen 

hypothefical protein FU20920 

gb:HQmo sapiens (clone WR4.10VH) anMh 

KiAA1555 protein 

ESTs 

gb:Homo sapiens mRNA for immunoglobulin 
hypothetical protein FU10494 
gb:H^ens mRNA for variable region of 
ESTs. Moderately similar to putative DNA 
hypotheQca! protein FLJ20051 
gb:H.sapien5 reananged Ig heavy chain { 
hypothetical protein LOC57822 
ESTs, Weakly sbnilar to T17330 hypotheti 
hypothetical protein FU12894 
Homo sapiens cDNA: FLJ23137 fis, clone L 
gb:Homo sapiens done 2A1 scFV anitbody 
RAB22A, member RAS oncogene family 
peptidylprolyl isomerase (cydophilin)-) 
gb:H^apiens J-csW receptor mRNA 
kinesi/i My member 13A 
zinc finger protein 180 {HHZ168) 

NM23^8 

0C2 protein 

myosin, light polypeptide, regulatory, n

ESTs

ESTs

hypothetical protein FLJ10S34
ESTs

protocadherin 12
ESTs

Homo sapiens clone 24468 mRNA sequence
p53 regulated PA26 nuclear protein
ESTs

ESTs, Weakly similar to Homolog of rat Z
ESTs, Weakly similar to unnamed protein
gb:ESTg6097 Testis I Homo sapiens cDNA 5
phosphatidic acid phosphatase type 2C
ATPase, (Na+)/K+ transporting, beta 4 po
ESTs

glucose phosphate isomerase
karyopherin (importin) beta 3
polymerase (RNA) II (DNA directed) polyp
Homo sapiens cDNA FLJ12363 fis, clone MA
gbato43c12J(1 NCI_CGAP_Ut1 Homo sapiens
ESTs, Weakly similar to ALU1_HUMAN ALU S
gb:xu71a11jc1 Na_CGAPJQd8 Homo sapiens 
gtexl68R15Jt1 NCI.CGAP.UI2 Homo sapiens 
gbcxp70b11jc1 NCLCGAP.Ov39 Homo sapiens 
gb»l66h02j(1 NCLC6AP_Ut2 Homo sapiens 
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304234 
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Hs.169476 


304430 


AA347682 
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AA932805 
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306325 
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Hs.210546 
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Hs.275865 
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HS.276Q18 
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ribosomal protein S27a 
eukaryotic translation elongation
gb:FB21B7 Fetal brain, Stratagene Homo s
gb:FB26F2 Fetal brain, Stratagene Homo s
gb:FB7C1 Fetal brain, Stratagene Homo sa
ribosomal protein S14
gb:yb42d06.s1 Stratagene fetal spleen (9
gb.7b73g01.s1 Stratagene ovary (937217)
gb7c04c12.s1 Stratagene lung (937210) H
ribosomal protein, large, P1
gb:yi87g02.s1 Soares placenta Nb2HP Homo
gb9n)31a06.s1 Soares infant brain 1NIB H
gb:yr78b06.s1 Soares fetal liver spleen
gb7y82d08.s1 Soares_multiple_sclerosis_
gbzdSShO&sl Soares_fetal_heart_NbHH19W
ribosomal protein, large, P0



6.50 
1J8 
2.15 
&88 
5.59 
6.55 

ai6 

Z64 
0.53 
6.49 
Z90 
1.00 
a79 
4.28 
a47 
1.34 
3.40 
2.93 



proteasome (prosome, macropain) 26S sub
gb2p38g12.s1 Stratagene muscle 937209 H
glyceraldehyde-3-phosphate dehydrogenase
gb:EST54044 Fetal heart II Homo sapiens
gb3v2600SAl SoaresJUiHMPiLSI Homo sapi
gbzx82c11.s1 Soares ovary tumor NbHOT H
gb2xO2c05.s1 Soares_total_fetus_Nb2HF8_
glyceraldehyde-3-phosphate dehydrogenase
serine (or cysteine) proteinase inhibito
gb:nh85e08£l Na.CGAP^.1 Homosaptan
ferritin, light polypeptide
ribosomal protein S23

gb3nn75h11.s1 Na_CGAP_Co9Honu)sa|^
gbam13g09.s1 NCI_CGAP_Co12 Homo sapiens
KIAA1685 protein
PRO2047 protein
vimentin
ESTs

immunoglobulin heavy constant gamma 3 (G
gb2UB9)i06s1 Soares_testis_NHT Homo sap
gb:db99c04^1 Stratagene lung (937210) H
gb3ir72a12.s1 NCI_CGAP_Pr24 Homo sapiens
ESTs

gbml01g08.s1 NCI_CGAP_Lym3 Homo sapiens
EST, Weakly similar to EF1D_HUMAN ELONG
gb:ag57d12.s1 Gessler Wilms tumor Homo s
glyceraldehyde-3-phosphate dehydrogenase
gb:ag37e01.s1 Jia bone marrow stroma Hom
nuclear factor of kappa light polypeptid
flM44f07.s1 Soares_fetal_liver_spleen_
EST

immunoglobulin heavy constant gamma 3 (G
gb:ai10f08.s1 Soares_parathyroid_tumor_N
gb:nx10c08.s1 NCI_CGAP_GC3 Homo sapiens
hypothetical protein FLJ11726
EST

gb3iz12e05.s1 NCI_CGAP_GCB1 Homo sapiens
hemoglobin, alpha 2

gb:si09h02.s1 Soares_parathyroid_tumor_N
ribosomal protein S18

gb:oe29a12.s1 NCI_CGAP_Pr25 Homo sapiens
gb:oe29c12.s1 NCI_CGAP_Pr25 Homo sapiens
gb:nw31e04.s1 NCI_CGAP_GCB0 Homo sapiens 4.49
gb:ai67a05.s1 Soares_testis_NHT Homo sap 4.91
ribosomal protein, large, P0
gb:of34a02.s1 NCI_CGAP_Kid6 Homo sapiens
gb:ak72b06.s1 Barstead spleen HPLRB2 Hom
gb:ak64a06.s1 Barstead spleen HPLRB2 Hom
ribosomal protein, large, P0
gb:oh63h08.s1 NCI_CGAP_Kid5 Homo sapiens
gb:ro(21h02.s1 NCI_CGAP_GC3 Homo sapiens



3.32 
1.00 
1.42 
ZiB 
5.38 
4.16 
0.55 
1.95 
Z10 
a33 
1.33 
3.68 
177 
7.16 
2.47 
6.78 
0.90 
&46 
1.00 
5.68 
1.48 
1.76 
1.00 
5.31 
0.78 
3.11 
4.38 
2.13 
1.20 
1.16 
&68 
2.21 
3.36 
1.00 
&44 

ai9 

1^ 
7JSr 
4.78 
0.89 



ai9 

&12 
1.66 
2.34 
0^ 
110 
032 



gb:am08b07.s1 Soares_NFL_T_GBC_S1 Homo s 1.56

ribosomal protein S6 kinase, 90kD, polyp &21

EST 1.98

gb:ok03g03.s1 Soares_NFL_T_GBC_S1 Homo s 7J8

gb:ok78g02.s1 NCI_CGAP_GC4 Homo sapiens 7.19

gb:ok85h1U1 NCI_CGAP_Kid3 Homo sapiens 6.50

gb:og21a07.s1 NQ.CGAPJ^Sl Homo sapiens 4.21

tRNA isopentenylpyrophosphate transferas Z20

gb:oo60g04.s1 NCI_CGAP_Ui5 Homo sapiens 184

gb»(S3hQ5.s1 NCI_CGAP_HN3 Homo sapiens 1.60

interleukin 21 receptor 1.65

ribosomal protein S18 3.78

EST, Moderately similar to JC4662 ribos 4.30

gb:op09d05.s1 NCI_CGAP_Kid6 Homo sapiens 0.95

hypothetical protein FLJ20284 3.19

gb:oq35e09.s1 NCI_CGAP_GC4 Homo sapiens 4.67

gb:oq72e12.s1 NCI_CGAP_Kid6 Homo sapiens 3.92
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8.23 

I. 61 

II, 67 
4.18 
176 
118 
11.34 
11.03 
1.16 
5.40 
4.42 
10.96 
5.99 
1.00 
133 
1.15 
14.11 
a23 

I. 20 
110 
183 
1162 
0.68 
&14 
3.70 

II. 01 
4.24 
11.66 

I. 23 

iai7 
too 

II. 59 
1.37 
4.61 
115 
8.14 
1.18 
6.66 
7.53 
166 
1.40 
0.68 
a87 
186 
6.54 
102 
9.10 
0.79 
1.00 
10.20 
1142 
0.70 
8.71 
9.40 
081 
029 
4.11 
4.25 
1.40 
5.21 
1.01 
1.12 
7.90 
6.S9 
20.69 
13.48 
9.13 
5.25 
170 
5.35 
1.12 
2.26 
6.32 
5.74 
145 
4.10 
7.44 
6.27 
TABLE 8C shows the genomic positioning for those Pkeys in Table 8A lacking Unigene ID's and accession numbers. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
TABLE 9A: Potential Therapeutic, Diagnostic and Prognostic Targets for Therapy of

Table 9A shows about 1312 genes upregulated in lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid
tumors) relative to normal lung tissues. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 9B shows the accession numbers for those Pkeys lacking UnigeneID's for table 9A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Table 9C shows the genomic positioning for those Pkeys lacking Unigene ID's and accession numbers. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples
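
The R1 definition above is a ratio of two group averages. As a minimal sketch of that arithmetic, the Python below computes such a ratio for a single probeset. The function name `r1_ratio`, the sample values, and the error handling are illustrative assumptions and are not taken from the specification, which does not describe how the averages were computed or normalized.

```python
from statistics import mean


def r1_ratio(tumor_values, normal_values):
    """Average tumor expression divided by average normal lung expression.

    Hypothetical helper: the input lists stand for per-sample expression
    intensities of one probeset; any upstream normalization is assumed
    to have been applied already.
    """
    if not tumor_values or not normal_values:
        raise ValueError("both sample groups must be non-empty")
    normal_avg = mean(normal_values)
    if normal_avg == 0:
        raise ValueError("average normal expression is zero; ratio undefined")
    return mean(tumor_values) / normal_avg


# Made-up intensities for one probeset (not data from the tables).
tumor = [120.0, 80.0, 100.0]   # average 100.0
normal = [10.0, 30.0]          # average 20.0
print(r1_ratio(tumor, normal))
```

A ratio near 1.00 under this definition would indicate similar average expression in the two groups, while larger values would indicate higher average expression in the tumor samples.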

Pkey 
400195 
400205 
400220 
40QZ77 



ExAccn



UnigeneID



400286 



400298 
400301 
400303 
400328 
400419 
400512 
400517 
400560 
400664 
400665 



X06256 

XD7820 

AA032279 

XD3635 

AA2427S8 

X87344 

ARn4545 

AF2423BB 



400749 
400763 
401027 
401093 
401203 
401212 
401411 
401435 

401464 AF039241 

401714 

401747 

401760 

401760 

401781 

401765 

401797 

401961 

401985 AF0S3004 

401994 

402075 

402260 

402265 

402297 

402408 

402420 

402674 

402802 

402994 

403137 

403306 NM.006825 

403329 

403381 

403478 

403485 

403627 

403715 

404044 

404076 

404101 

404140 

404165 

404165 

404210 

404253 



Unigene Title 

NM_007057*:Homo sapiens ZW10 interactor
NM_006265*:Homo sapiens RAD21 (S. pombe)
Eos Control
Eos Control
Eos Control

integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 10 (stromelysin
six transmembrane epithelial antigen of
1 



Hs.149609
Hs^S8
Hs.61635
Hs.1657

Hs.79136 LIV-1 protein, estrogen regulated
Hs.180062 transporter 2, ATP-binding cassette,
Target

NM_030878*:Homo sapiens cytochrome P450,
lengsin

NM_030878*:Homo sapiens cytochrome P450,
NM_002425:Homo sapiens matrix metallopro
NM_002425:Homo sapiens matrix metallopro
NM_002425:Homo sapiens matrix metallopro
NM_003105*:Homo sapiens sortilin-related
Target Exon 
Target Exon 

C12000586*:gi|6330167|dbj|BAA86477.1| (A
Target Exon

Cia00045r:gi|7512178|pir||T30337 polypr
ENSP00000247172*:HYPOTHETICAL 126.2kDa
C140003d7*:gi|7499698|pir||T33295 hypoth
histone deacetylase 5

ENSP00000241802*:CDNA FLJ11007 FIS, CLON
Homo sapiens keratin 17 (KRT17)
Target Exon

NM_005557*:Homo sapiens keratin 16 (foca
Target Exon

NM_002275*:Homo sapiens keratin 15 (KRT1
Target Exon

NM_021628:Homo sapiens serine carboxypep
class I cytokine receptor
Target Exon

ENSP000002S1056*:Plasma membrane calcium
NM_001436*:Homo sapiens fibrillarin (FBL

Target Exon
Target Exon

NM_030920*:Homo sapiens hypothetical pro
C1000823*:gi|10432400|emb|CAC10290.1| (A
Target Exon

NM_001397:Homo sapiens endothelin conver
NM_002463*:Homo sapiens myxovirus (influ
NM_005381*:Homo sapiens nucleolin (NCL),
transmembrane protein (63kD), endoplasmi
Target Exon

ENSP00000231B44*:Ecotropic virus integra
NM_022342:Homo sapiens kinesin protein 9
C3001813*:gi|12737279|ref|XP_012163.1| k
Target Exon
Target Exon

ENSP00000237e55*:DJ398G3.2 (NOVEL PROTEI
NM_016020*:Homo sapiens CGI-75 protein (
C8000950:gi|423560|pir||A47318 RNA-bindi
NM_006510:Homo sapiens ret finger protei
ENSP00000244S62:NRH dehydrogenase (quino
Target Exon

NM_005936:Homo sapiens myeloid/lymphoid
NM_0210S8*:Homo sapiens H2B histone fam
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IXX) 


1.00 
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2^8 


2M 


7.68 


9.72 


1.00 


1.00 


1.04 


2.24 


132.45 


400 


43.86 


74X0 


1.00 


1.00 


tn 


1.65 


0.87 


1.80 


156.55 


253.00 


1.00 


ZOO 


3.67 


87.00 


1.00 


1.00 


20.26 


45.00 


1.36 


1.07 


3.26 


3.22 


1.00 


91X0 


7.63 


24.00 


1.00 


1.00 


1.00 


155.00 


1X0 


66.00 


1.00 


4oaoo 


1.00 


^00 


1.00 


64.00 


3.82 


49.00 


2.02 


40.00 


128.43 


68X0 


1.74 


35X0 


26.47 


10.50 


10!33 


4.61 


4.13 


Z7Q 


1.44 


Z10 


1.41 


- .1.86 


1.00 


177.00 


61.84 


47.00 


1.00 


1.00 


1.58 


1.39 


2.09 


35.00 


1.00 


92.00 


28.87 


13.00 


t.00 


t.44 


7.44 


243.00 


1.00 


70.00 


1.37 


1.43 




19X0 


IjOO 


43X0 


1X0 


61X0 


1X0 


iiaoo 


28.13 


136.00 


20.23 


76.00 


6.30 


29.33 


1.30 


3S.00 


1.00 


54.00 




91.00 


1.00 


1.00 


1.42 


1.44 


1.00 


54X0 


1.00 


117.00 


5.93 


ia77 


1.00 


1.00 
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Ue 071 
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407378 
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Hs.57776 
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407430 
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gbiHomo saptsns protein tyrosino phospha 


407453 
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^iHomo ssif^ans mRMA for axonemd dynobi 
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ubi^uiOn specific protease 18 
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Ue ooocc 
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407782 
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0 lUu caiciunruiiiQing prmcin nc 


407790 
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nOmo sapiens cuina ruv ihooo ns, ctono ru 


407B11 






t^rsiBine luKn supenamiiy i» oMr aniogoii 


407839 


AA045144 


ns.16i56D 


ESTs 


407944 
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Ue eon 
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A/\tn'yi 
40o1^ 




Ue AODOil 

nS.4/024 
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408482 
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408522 
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ESTs 


408545 
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ESTs 
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H5.226558 


ESTs. Moderately similar Id ALU4.HUMANA 
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Hs.46677 


PR02000 protein 
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ESTs, Moderately similar to PC4259 ferri 
408783  AF192522    Hs.47701    NPC1 (Niemann-Pick disease, type C1, gen
408790  AW5B0227    Hs.47860    neurotrophic tyrosine kinase, receptor,
408805  H69912      Hs.48269    vaccinia related kinase 1
408841  AW438865    Hs.256862   ESTs
408873  AL046017    Hs.182278   calmodulin 2 (phosphorylase kinase, delt
408908  BE296227    Hs.250822   serine/threonine kinase 15
408992  MSSS133S    Hs.71642    guanine nucleotide binding protein (G pr
408996  AI979168    Hs.344086   glycoprotein (transmembrane) nmb
409015  BE389387    Hs.49767    NM_004553:Homo sapiens NADH dehydrogenas
409038  T97490      Hs^002      small inducible cytokine subfamily A (Cy
409041  AB03302S    Hs.50081    Hypothetical protein, XP_051860 (KIAA119
409077  AA401369    Hs.190721   ESTs
409093  BE243834    Hs.S0441    CGI-04 protein
409103  AF251237    Hs.112208   XAGE-1 protein
409142  A1136877    Hs.50758    SMC4 (structural maintenance of chromoso
409187  AF154830    Hs.S0966    carbamoyl-phosphate synthetase 1, mitoch
409228  AI654298    Hs^1695     ESTs, Weakly similar to 2109260A B cell
409234  A)879419    Hs.27208    ESTs
409268  AA625304    Hs.187579   ESTs
        AA5769S3    H5JJ2972    hypothetical protein FLJ13352
409361  NM_005982   Hs.54416    sine oculis homeobox (Drosophila) homolo
409404              Hs.129056   ESTs
409420  Z15008      Hs.544S1    laminin, gamma 2 (nicein (100kD), kalini
409430  R21945      Hs.346735   splicing factor, arginine/serine-rich 5
409446  AI561173    Hs.67688    ESTs
409506  NM_006153   Hs.54589    NCK adaptor protein 1
409522  AA075382                gtK2in87b0.s1 Stratagene ovarian cancer
409582  AA401369    Hs.190721   ESTs
409632  W74001      Hs.55279    serine (or cysteine) proteinase inhibito
409705  M37762      Hs.56023    brain-derived neurotrophic factor
409719  AI769160    Hs.108681   Homo sapiens brain tumor associated prot
409731  AA12S985    Hs.56145    thymosin, beta, identified in neuroblast
409744  AW8752S8    Hs.56265    Homo sapiens mRNA; cDNA DKFZp586P2321 (f
409757  NM_001898   Hs.123114   cystatin SN
409866  AVV^1S2                 gb:UI-HF-BI^0p-2jr+11-O-Ul.r1 NIH_MGC_5
409893  AW247090    Hs.57101    minichromosome maintenance deficient (S.
40^02   A1337658    Hs.156351   ESTs
409935  AW511413    Hs.278025   ESTs
409956  AW103364    Hs.727      inhibin, beta A (activin A, activin AB a
40^58   NM_001523   Hs.57697    hyaluronan synthase 1
410001  A5041036    Hs.57771    kallikrein 11
410032  BE065985                gb:RC3-BT0319-120200-014-a09 BT0319 Homo
410037  AB020725    Hs.58009    KIAA0918 protein
410044  BE566742    Hs.58169    highly expressed in cancer, rich in leuc
410048  W7$467      Hs.58216    proline oxidase homolog
410076  T05387      Hs.7991     ESTs
410102              Hs.279727   Homo sapiens cDNA FLJ14035 fis, clone HE
410153  BE311S26    Hs.15830    hypothetical protein FLJ12691
410166  AK001376    Hs.59346    hypothetical protein FLJ10514
410193  AJ132592    Hs.59757    zinc finger protein 281
410274  AA381807    Hs.61762    hypoxia-inducible protein 2
410309  BSM3077     Hft97ft1<a  ESTs
410340  nVf lO&Dww  Hs.112188   hypothetical protein FLJ13149
410348  AW182663    Hs.95469    ESTs
410407  X66839      Hs.63287    carbonic anhydrase IX
410418  D31382      Hs.6332S    transmembrane protease, serine 4
410438  AB0S7756    Hs.45207    hypothetical protein KIAA1335
410553  AW016824    Hs.255527   hypothetical protein MGC14128
410555  WZ7235      Hs.64311    a disintegrin and metalloproteinase doma
410561  BES4025S    Hs.6994     Homo sapiens cDNA: FLJ22044 fis, clone H
410681  AW246890    Hs.65425    calbindin 1, (28kD)
410781  A1375672    Hs.165028   ESTs
411027  AF072099    Hs.67846    leukocyte immunoglobulin-like receptor,
411074              Hs.68137    adenylate cyclase activating polypeptide
411089  AA^6454                 cell division cycle 2-like 1 (PITSLRE pr
411152  BE0691S9                gb:Q^BT03794)10300-105-g03 BT0379 Homo
411248  AA551538    Hs.334605   Homo sapiens cDNA FLJ14408 fis, clone HE


ADftlPf^Q 
MOUI09*(9 






411263  RF7Q7R0?    na.asvwv    kinesin-like 6 (mitotic centromere-assoc
411365  M76477      Hs.289082   GM2 ganglioside activator protein
411402  BE297855    Hs.69855    NRAS-related gene
411573  AB029000    Hs.70823    KIAA1077 protein
411579  AC0flS2S8               U6 snRNA-associated Sm-like protein LSm7
411617  AA247994    Hs.90063    neurocalcin delta
411732  AA059325    Hs.71642    guanine nucleotide binding protein (G pr
411773  NM_006799   Hs.72026    protease, serine, 21 (testisin)
411789  AF245505    Hs.72157    Adlican
411800  N39342      Hs.103042   microtubule-associated protein 1B
411945  ALJ033527   Hs.92137    v-myc avian myelocytomatosis viral oncog
412115  AK001763    Hs.73239    hypothetical protein FLJ10901
412140  AA219691    Hs.73625    RAB6 interacting, kinesin-like (rabkines
412276  BE262621    Hs.73798    macrophage migration inhibitory factor (
412464  T78141      Hs.22826    ESTs, Weakly similar to IS5214 salivary
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hypottiellcal protein FU13346 
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412659 


AW753BSS 


Hs.74376 


412719 


fWWIDvlU 


Hs^16 


412723 


AA648459 




412811 


H06382 




412817 




Hs.74619 




A A 191671 
MnlclO/O 


Me WtKf 




BCIMIU.99 


nSkfOcDo 




lOOsVI 


HS.TO117 


413011 


MtVUDOI to 


Hs.fi21 




M(n99l 


Lie 7C 109 




nUMOlol 


Hs.75184 




AF292100 


noiiuwio 


413142 


M81740 


Hs.75212 


413223 


m f wb 1 Q£ 


Hs 191866 


413248 


T64853 




413273 


U7^79 


HS-7S2S7 


413276 


BES6308S 


(te.633 


413281 


AA861271 




413364 


BE53fi91& 


Hs 137516 


413385 


M34455 


Hs.B40 


413409 


AI638418 


Hs.1440 




AA129640 


Hs 19fl0fiS 
nii> icwQy 




DC9'«n7RR 
DCaOU/ OQ 




413554 


AA^IQl^ 
fVW 10 l*KI 


M<:7(i^96 


413573 


AiTqiccq 




413582 


AUV9QfiftA7 

AVVM004/ 


Hs.71331 


413597 


MVVOUcDOO 


nSf 1 1 f loo 


t 1 CDaU 


DCIO/409 




413691 




Ue 7i^7P 


413719 


DCM900U 


no</3*t90 




U17760 


Hs.75Sl7 




MKWC40 


Hb.35406 




Z15005 


Us7«t7S 
n9«>90/o 




MM lO^f 0 


Hs.1 84492 
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ns»940 lu 


Ai'ViA^ 

llOtfhJ 
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414142 
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nIDOOOVrf 


i9nQfm 

no^l^uwO 


AiAOAK. 


DCi40U/£ 


Ue 7Cfl.Sn 

nStf ooou 


AiAVTtl 


nW9fU£04 


UaRRQ 
nStOOs 


41 Ml 7 


Pc/DOtOU 


no. r 0000 
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Ue7CQf\Q 
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Y(X)285 


Ue 7e47^ 


414618 
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R79015 
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41 WOO 


S78298 


Hs.76688 


414696 


AHVKM9n 


TJBtlVtJ IV 


41^711 
•114/11 


nlOIW44u 


Ue 9Da7^C 


414/ 10 


n9«M40 


Ue in7QR7 


414/ 0£ 


AUtMinQ7fi 


Hs.771S2 


414/ 4# 


inflR79 


MB779nA 
ns.rrcU4 


414/01 


Ain7799fi 


Ue779ee 
nStf/AOQ 


AiATJA 
4l4//«t 


Au£419 


Ue7797A 
nB.//£/4 


4140UD 
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Ue 77^90 


4140U9 


nl4040a9 


ns./ rODO 


414olZ 


A/ZfOO 


ns.r f OOf 


4l40£0 


AUOOfU 
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AiAma 
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Ue77^9 
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4101 do 


l/IOOOD 


Ue OfiRQAA 


4io^r 
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Ue 79An9 
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no/ /OU 


Ue 91X99 


410200 


AAQ^fl/tQQ 


nS.I^UooO 


41S295 


R41450 


H$.K46 


415339 


NM.015156 


Hs.76398 


415669 


NM_005025 


H$.78S89 


415674 


BE394784 


Hs.78596 


415709 


AA649850 


Hs.278558 


415735 


AA704162 


H8.120811 


415799 




HS.225B41 


415817 


U88967 


Hs.76867 


415857 


AA8S6115 


H8.127797 


415989 


Aia67700 





olfactomedin related ER localized protei
ESTs

hypothetical protein AF301222
ESTs

proteasome (prosome, macropain) 26S subu

zinc finger protein 281

H2A histone family, member Y

interleukin enhancer binding factor 2, 4

biglycan

mannose receptor, C type 1
chitinase 3-like 1 (cartilage glycoprote
RP42 homolog
ornithine decarboxylase 1

ESTs

hypothetical protein DKFZp547J036
stem-loop (histone) binding protein
interferon-stimulated protein, 15 kDa
transcription factor BMAL2
fidgetin-like 1

indoleamine-pyrrole 2,3 dioxygenase
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
ESTs

hypothetical protein FLJ12443
secretogranin II (chromogranin C)
ESTs

hypothetical protein MGC5350
ESTs 

gb:RC1-HT0375.1202(XM)1 1-e06 HT0375 Homo 

ATPase, Class VI, type 11B

small inducible cytokine subfamily A (Cy

laminin, beta 3 (nicein (125kD), kalinin

ESTs, Highly similar to unnamed protein

centromere protein E (312kD)

ESTs

ESTs

Homo sapiens cDNA FLJ12981 fis, clone NT
syntaxin 1A (brain)

serine (or cysteine) proteinase inhibito
Homo sapiens cDNA FLJ14438 fis, clone HE
Homo sapiens cDNA FLJ11448 fis, clone HE
WAS protein family, member 1
Charcot-Leyden crystal protein
phosphogluconate dehydrogenase
hypothetical protein FLJ10036
KIAA0182 protein
uridine monophosphate kinase
hypothetical protein MGC2721
ubiquitin carboxyl-terminal esterase L1
insulin-like growth factor 2 receptor
hypothetical protein MGC10764
interleukin enhancer binding factor 1
hypothetical protein MGC12702
Niemann-Pick disease, type C1
Homo sapiens cDNA FLJ13522 fis, clone PL
ESTs

minichromosome maintenance deficient (S.
centromere protein F (350/400kD, mitosin
enhancer of zeste (Drosophila) homolog 2
445818  BE045321    Hs.136017   ESTs  1.00  1.00
445873  AA250970    Hs.251946   poly(A)-binding protein, cytoplasmic 1-l  49.42  54.00
445885  AI734009    Hs.127699   KIAA1603 protein  1X0  132.00
445896  AF070623    Hs.13423    Homo sapiens clone 24468 mRNA sequence  1.00  1.00
445903  AI347487    Hs.13Z781   class I cytokine receptor  IXX)  36.00
445932  BE046441    Hs.33355S   Homo sapiens clone 24859 mRNA sequence  2.41  2.88
445982  BE410233    Hs.13$01    pescadillo (zebrafish) homolog 1, contai  1.60  1.35
446078  AI339982    Hs.156061   ESTs  100  42.00
446102  AW16B067    Hs^17694    ESTs  1.00  too
446157  BE270828    Hs.131740   Homo sapiens cDNA: FLJ22562 fis, clone H  1.70  1.53
446269  AW283155    Hs.14559    hypothetical protein FLJ10540  mi  48.00
446292  AF081497    Hs.279682   Rh type C glycoprotein  1.55  1.26
446293  AI420213    Hs.149722   ESTs  1.00  ZOO
446423  AW139655    Hs.150120   ESTs  1.10  4.19
446428  AW082270    Hs.12498    ESTs, Weakly similar to ALU4_HUMAN ALU S  0.53  3.26
446432  AI377320    Hs.150058   ESTs  1.00  5.00
446528  AU076540    Hs.15243    nucleolar protein 1 (120kD)  1^  1.31
446574  AI310135    Hs.33S933   ESTs  3.89  72.00
446619  AU076643    Hs.313      secreted phosphoprotein 1 (osteopontin,  3a03  20.23
446636  AC002563    Hs.15767    citron (rho-interacting, serine/threonin  4.19  5.07
446783  AW138343    Hs.M1867    ESTs  2.82  9.47
446839  BE091926    Hs.16244    mitotic spindle coiled-coil related prot  iia28  28.00
446849  AU076617    Hs.16251    cleavage and polyadenylation specific fa  3.26  2.94
446856  AI814373    Hs.164t75   ESTs  636  11.30
446872  X97C58      Hs.f6362    pyrimidinergic receptor P2Y, G-protein c  1.96  Z03
446880  A1811807    Hs.108646   Homo sapiens cDNA FLJ14934 fis, clone PL  94.90  113.00
446921  AB012113    Hs.16530    small inducible cytokine subfamily A (Cy  1.67  330
446969  AK001896    Hs.16740    hypothetical protein FLJ11036  2.82  ai2
447022  AW291223    Hs.157573   ESTs  1.00  moo
447033  AI357412    Hs.157601   ESTs  7.15  107.00
447078              Hs.9914     ESTs  47.24  24.00
447081  Y13896      Hs.17287    potassium inwardly-rectifying channel, s  0.12  17.88
447131  NM_004585   HsJ7466     retinoic acid receptor responder (tazaro  0.97  1.48
447149  BE2S9857    Hs.326      TAR (HIV) RNA-binding protein 2  1.24  1.26
447153  AA805202    Hs.315562   ESTs  1.00  54.00
447164  AF026941    Hs.17518    Homo sapiens cig5 mRNA, sequence  1.00  67.00
447178  AW594641    Hs.192417   ESTs  3.42  50.00
447250  A)87e909    Hs.17B83    protein phosphatase 1G (formerly 2C), ma  1.60  1.S2
447289  AW2470t7    Hs.36976    melanoma antigen, family A, 3  1.00  1.00
447342  AI199268    Hs.19322    Homo sapiens, Similar to RIKEN cDNA 2010  28.63  1.00
447343  AA2S6641    Hs^94       ESTs, Highly similar to S02392 alphas  146.62  51.00
447350  AJ375572    Hs.172634   ESTs  1.00  1Z0O
447377  N27687      Hs.334334   transcription factor AP-2 alpha (activat  ZS5  63.00
447415  AWg37335    Hs.28149    ESTs, Weakly similar to KF3B_HUMAN KINES  asi  1.13
447425  Am37A7      Hs.18573    acylphosphatase 1, erythrocyte (common)  t.00  35J10
447519  U46258      Hs.339865   ESTs  59.89  49JK)
447532  AK000814    Hs.19791    hypothetical protein FLJ20607  1.23  1.63
447534  AA401369    Hs.190721   ESTs  1.00  17.00
447636  Y10043                  high-mobility group (nonhistone chromoso  1.41  1.11
447688  N87079      Hs.19236    Target CAT  1.00  ^0
447733  AF157482    Hs.19400    MAD2 (mitotic arrest deficient, yeast, h  1.17  1.12
447769  AW873704    Hs.320831   Homo sapiens cDNA FLJ14S97 fis, clone NT  6.47  5.95
447802  AW593432    Hs.161455   ESTs  a73  Z34
447850  AB0182g8    Hs.19822    SEC24 (S. cerevisiae) related gene famil  86.45  11&00
447924  AI817226    Hs.313413   ESTs, Weakly similar to T23110 hypotheti  1.00  1.00
447973  AB011169    Hs.20141    similar to S. cerevisiae SSM4  aso  427
448030  N30714      Hs.32S960   membrane-spanning 4-domains, subfamily A  4.13  14Z00
448105  AI538613    Hs.298241   Transmembrane protease, serine 3  1.15  2.24
448243  AW369771    Hs.52620    integrin, beta 8  '\BM  1.00
448278  W97369      Hs.11782    ESTs  a97  1.90
448290  AK002107    Hs.20843    Homo sapiens cDNA FLJ11245 fis, clone PL  1.00  1.00
448296  BE622756    Hs.1{^9     Homo sapiens cDNA FLJ14162 fis, clone NT  Z42  Z17
448357  BE274396    Hs.108923   RAB38, member RAS oncogene family  1.44  1.08
448390  AL035414    Hs.21068    hypothetical protein  1.00  43.00
A49m    AW504732    Hs.21275    hypothetical protein FLJ11011  263  Z48
448569  BE382657    Hs.21486    signal transducer and activator of trans  1.84  2.S3
448663  BB14599     Hs.106823   hypothetical protein MGC14797  3.29  48.00
448572  AI955511    Hs.225106   ESTs  1.00  21.00
448733  NM_005S29   Hs.187958   solute carrier family 6 (neurotransmitte  1.82  1.08
448741  BE614557    Hs.19574    hypothetical protein MGCS469  2.48  1.92
448757  A1366764    Hs.48820    TATA box binding protein (TBP)-associat  23.53  20.00
448775  AB02S237    Hs.3B8      nudix (nucleoside diphosphate linked moi  234  1.97
448826  A1580252    Hs.293246   ESTs, Weakly similar to putative p150 [H  74.07  62.67
448830  AUD31658    Hs^161      hypothetical protein <U310O13^  1J7  1.31
448844  AIS81S19    Hs.177164   ESTs  1.00  31I»
448988  Y09763      Hs.22785    gamma-aminobutyric acid (GABA) A recepto  1.84  1.96
448993  AM71630                 KIAA0144 gene product  1.63  1.49
449003  X76342      Hs.389      alcohol dehydrogenase 7 (class IV), mu o  1.00  1.00
449029  N28989      Hs^l        solute carrier family 7 (cationic amino  1.97  Z26
449040  AF040704    Hs.149443   putative tumor suppressor  Ql97  1.56
449048  Z45051      Hs^920      similar to S68401 (cattle) glucose induc  27.13  90.00
449053  AI625777    Hs^766      ESTs  8.33  Am
449054  AF148848    Hs.22934    myoneurin  73.85  104X0
449101  AA205847    Hs.23016    G protein-coupled receptor  2.SB  27jO0
449167  T05095      Hs.19597    KIAA1694 protein
449207  AU)44222    Hsi3255     nucleoporin 155kD
449228  Ami 07      Hs.148590   protein related with psoriasis
449230  BE613348    Hs^l1579    melanoma cell adhesion molecule
449305  A\638293                gbii09b07 j(1 NCI_CGAP_GC6 Homo sapiens
449318  AW235021    HsJ6531     Homo sapiens, Similar to RIKEN cDNA 5730
449446  060730      Hs.57471    ESTs
449467  AW205005    Hs.197042   ESTs
449523  NM_000579   Hs.54443    chemokine (C-C motif) receptor 5
449722  BE280074    Hs.23960    cyclin B1
449978  H06350      Hs.135056   Human DNA sequence from clone RP&-850E9
450001  NM_001044   Hs.406      solute carrier family 6 (neurotransmitte
450098  W27249      Hs.8109     hypothetical protein FLJ21080
450101  AV649989    Hs^4385     Human htx^7 mRNA sequence
450149  AW969781    Hs.132863   Zic family member 2 (odd-paired Drosophil
450193  AI916071    Hs.15507    Homo sapiens Fanconi anemia complementat
450221  AA32B102    Hs^4641     cytoskeleton associated protein 2
450372  BE218107    Hs.202436   ESTs
450375  AA009647    Hs.e650     a disintegrin and metalloproteinase doma
450447  AF212223    HsiSOlO     hypothetical protein P15-2
450568  AL050078    Hs.25159    Homo sapiens cDNA FLJ10784 fis, clone NT
450689  AI70150S    Hs.202526   ESTs
450584  AA872605    Hs.25333    interleukin 1 receptor, type II
450701  H39960      Hs.288467   Homo sapiens cDNA FLJ12280 fis, clone MA
450705  U90304      Hs.25351    iroquois homeobox protein 2A (IRX-2A) (
450832  AA401369    Hs.190721   ESTs
450837  R49131      Hs.262&7    ATP-dependant interferon response protei
450983  AA305384    Hs.25740    ERO1 (S. cerevisiae)-like
451105  A1761324                gbMi60b1lj(1 (4GL06APjGo16 Homo sapiens
451110  AI955040    Hs;{65398   ESTs, Weakly similar to transformation-r
451253  H48299      Hs^6126     claudin 10
451291  R392B8      Hs.6702     ESTs
451320  AW496974                diacylglycerol kinase, zeta (104kD)
451380  H09280      Hs.13234    ESTs
451386  AB029006    Hs.26334    spastic paraplegia 4 (autosomal dominant
451437  H24143      Hs.3194S    hypothetical protein FLJ11071
451462  AK000367    Hs.28434    hypothetical protein FLJ20360
451524  AK001466    Hs^16       hypothetical protein FLJ10804
451541  BE279383    Hs.26557    plakophilin 3
451592  AI605416    Hs^13897    ESTs
451635 


AA018899 


Hs.127179 


oypticQene 


451743 


AA401369 


Hs.190721 


ESTs 


451808 


NM_003729 


Hs^Q76 


RNA 3*460110131 phosptide cydase 


451807 


W52854 




hypothsticd protdn nJ23293 sirnilar to 


451871 


AI821005 


Hs.118599 


ESTs 


451952 


AL120173 


H8^1663 


ESTs 


452012 


AA307703 


Hs.279766 


1 J- - at— * - — 1 II ..ill II f JIA 

Knespi lamay unmusf vk 


452046 


AB018345 


HS.Z7G57 


KIAA0802ptislBln 


452194 


AI694413 


Hs.332849 


dfactory rsoeplor, fesrily 2, sublarrity 


452206 


AW340281 


H5^074 


Homo sapiens, done ayiAGE:36Q6519, mRHA. 


452240 


AA401369 


H8.190721 


ESTs 


452255 


AK0OO933 


Hs^l 


Homo sapiens cONA FU10071 68. done HE 


452281 


T93500 


Hs.28792 


Homo sapiens cDNA FU11041 fis. done PL 


452291 


AF015592 


Hs.28853 


C0C7 (c^ diviston cyde 7. S. cerevist 


452295 


BE379936 


H&2B866 


programmed ceil dealti 10 


452304 


AA025386 


H8.61311 


ESTs. Weakly sin^ to S10590 cysteine 


452340 


HK/L002202 


HsiOS 


iSLI transcription factor, UM/hocneodoma 


452349 


AB028944 


Hs29169 


ATPase, Ct^ V), 1 1 A 


452367 


U71207 


Hs.29279 


eyes atsent (Oro5(^l)ita) homolog 2 


4S2401 


NMl007115 


Hsi93S2 


tumor necrosis fador, alptia^nduced pro 


452410 


AL133619 




Homo sapiens mRNA; cDNA OKFZp434E2321 (f 


452461 


H78223 


Hs.108106 


transcfipfion fadof 


452571 


W31518 


Hs.34665 


ESTs 


452613 


AA4S1599 


Hs.23459 


ESTs 


4S2899 


AW295390 


Hs^13D62 


ESTs 


452705 


H4dS05 


Hs^46005 


ESTs 


452747 


AF160477 


K3.61460 


Ig superMy leoeptor IXIR 


452787 


AW294022 


Hs.222707 


K1AA1718 protein 


452795 


AW392S55 


H3.16878 


iiypothetical protein FU21820 


452823 


AB012124 


Hs.30696 


transcription factor^ 5 (bade tiellx 


452833 


6E559681 


Hs.30738 


KIAA0124 protein 


452838 


U65011 


HsJ0743 


preferenfla&y expressed anfigen in mela 


452882 


AA401389 


Hs,190721 


ESTs 


452865 


AW173720 


HS.34S605 


ESTs. Weaidy sbrilar to A47582 B«dl gr 


452934 


AA581322 


Hs.4213 


tiypotheticaJ pnteiR M6C16207 


45^ 


X95425 


Hs.31092 


EphAS 


452976 


R44214 


Hs.101169 


ESTs 


453028 


AB006532 


Hs.31442 


RecQ protein-like 4 


453095 


AW2g5660 


H5i52756 


ESTs 


453102 


NiyL007197 


Hs.31664 


(ifizzled {DrosQphUa) homdoo 10 


453103 


A1301052 


Hs.153444 


ESTs 


453120 


AA292891 


Hs^1773 


pregnancy-induced gniwtb inbilAor 


453153 


N53893 


Hs.24360 


ESTs 


453160 


AI263307 


HS.2398B4 


H28 tfistonefamOy, meivber L 


453197 


AIS16269 


HS.1090S7 


ESTs, Vtteak^simBar to AUi5_HUMAN ALUS 



1.61 


2.36 


2.36 


1.56 


1.15 


1.15 


206.65 


151.00 


17^ 


45.00 


26.39 


35.00 


1.00 


1.00 


1.00 


1.00 


56.80 


2ia86 


15a03 


1.00 


2.16 


Z85 


1.17 


1.45 


1.79 


2.38 


1.00 


69.00 


1.00 


1.00 


2a85 


34.00 


1.00 


1.00 


1.00 


1.00 


51.26 


93.00 


123.20 


181.00 


1,00 


19.00 


1.00 


aoo 


1.00 


100.00 


1.89 


1.65 


1.00 


45.00 


25.17 


17.00 


90.92 


90.00 


a33 


1.70 


15.02 


124.00 


1.00 


moo 


3.02 


2.29 


1.00 


1.00 


Z92 


18.00 


6.90 


6.67 


35.75 


7Z00 


1.00 


69.00 


1^ 


Z10 


1.13 


1.07 


1.68 


1.33 


1.00 


1.00 


1.52 


1.92 


4.95 


17.00 


13.55 


31.00 


1.55 


35.00 


1.81 


2.53 


1.00 


22.00 


143 


226 


S&J99 


19.00 


1.67 


4.09 


9.31 


53.00 


13.42 


17.00 


39.03 


94.00 


153.01 


34aoo 


1.95 


23.00 


42.33 


61.00 


1.17 


2.14 


1.00 


13.00 


1.09 


1.42 


54.49 


53.00 


1.00 


3ZO0 


1.26 


1.99 


TAAT 


35.00 


54.61 


102.00 


1.39 


1.32 


1.00 


26.00 


1.00 


1.00 


112.67 


1.29 


1.00 


1.00 


1.00 


1.00 


7.91 


75.00 


ai6 


1.92 


17435 


1,00 


98.26 


17,00 


1.55 


1.00 


1.73 


1.19 


1.00 


1.00 


1.88 


1.98 


1.80 


1.60 


0.77 


1.50 


1.00 


1.00 


1.00 


1.00 


1.23 


1^20 


1.00 


83i)0 


1.00 


2m 


1.00 


134.00 
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4S3210 AL13316t Hs^360 
453240 At9G9S64 HS.1G6254 
453317 NM„002277 H&41696 
453323 AF034102 Hs^l 
453331 AI240665 H&8SS0 
453392 U23752 HS.32S64 
453431 AF094754 HsJNm 
453439 AI57243& Hs.32976 
453459 BE047032 Hs^7789 
453563 AW608g06.con^ 
453633 M357001 Hs.34045 
453775 Nlil.002918 Hs^120 
453830 AAS34296 Hs^953 
453857 AL080235 Hs.35881 

453867 A1929383 Hs^Q32 

453883 AI638516 Ks^7524 

453884 AA3S5925 H&36232 
453900 AW003582 Hs^6414 
453922 AF053306 Hs.3e708 
453941 U39817 Hs^BZO 
453964 >U961466 Hs.12744 
453968 AA847B43 Hs^ll 
453976 BE463830 H$.163714 
454024 AA993527 Hs^3907 
454034 NM.000691 Hs.575 
454042 T19228 Hs.172572 
4540S9 NML003154 H8J7048 
454066 mas^ Hs^70S8 
454098 W27953 Hs.292911 
454241 BE144666 

454417 AI244459 Hs.1 10826 

454439 AW819152 H&1S4320 
455175 AW993247 

455601 At366680 Hs^16 
456237 AA203682 

456321 NIbL001327 Hs.87225 

455475 NIIL000144 Hs.95998 

4S6SQ8 AA5Q2764 H$.123469 

456534 X91195 Hs,10Q623 

456736 AW248217 Hs.1619 

456759 BE2591S0 Ks.127792 

456990 rayL004S04 Hs.171545 

457200 U33749 Hs.f97764 

457234 AW966360 Ks.143S5 

457465 AW301344 Hs.122908 

457489 AI693815 Hs.127179 

457646 AA7256S0 Hs.112948 

457733 AW974812 H&291971 

457819 AA057484 Hs.35408 

458092 BE545684 H5.343556 
45809B BE550224 

456207 T28472 H5.7655 

458242 8E299588 Hs.28465 . 

458247 R14439 Hs^9194 

458679 AW975460 Hs.142913 

458778 AW451034 HS.32652S 

456933 A1638429 Ks.24763 

459352 AW810383 HsJS)6B28 

459670 FD1020 H&172004 
459702 AI204995 



hypotiieticat protein aJ10867 

hypo(li6tiC3l protein DXFZp566I133 

tefBiiR, hmt, acidici 

solute earner family 29 (nucteoslde tra 

ESTs 

SRY (sax detemiining region YH»3x 1 1 
glycitereoeptDr.beta 
^ guanine nudeolide binding protein 4 
ESTs 
Hs.181163 

hypotheticatpfDt8inFU20764 
f^pBcaSon factor C (adhf^ 1} 4 (37 
ESTs 

DKFZP586E1621 protein 

i^ypotiieticai protein DKFZp434N185 

cofadDT required lor Spl transcriptiona 

KtAA0186 gene product 

ESTs, Weah^ similar to ALU6.HUMAN ALU S 

budcfing un'mMbilBd by berudmidazotes 1 

Bloom syndrome 

ESTs 

Homo sapiena. done HyiAGE:3351295, mRNA 
ESTs 

hypothetical protein FU23403 

aidehyda dehydrogenase 3 My. member 

hypothetical protein FU20Q93 



cdcMnfcalcitonin^ated polypeptid 
ESTs. H^hly similar to S60712 band*pr 
gb:Cft/l2-HT017WM1099^17-c02 HT0176HoniO 
trinucleotide repeat contaSnlng 9 
OKF2FS6601646protein 
gbd^C2-6N0033-1802Q04)14^ 81W033 Homo 
SFtV (sex detannihlng region Y}-boK 2 
gb:zx52e07.r1 SoaresJetaUivarjspleen^ 
cancer/lesfe antigen 
f^ried^e)ch ^sDda 

ESTs. W^8imilartoAF208855 1 B1IM1 

phospholipase C. beta 3. neighbor pseudo 

achaete-scute conpiex (l3rosoph]la) homo! 

delta(Orosophti^3 

HIV-IRevbbidhigprot^ 

tbynHd transcripfion factor 1 

Homo sapiens cDNA FU13207 fis. ciOne HT 

DMA replication factor 

cryptic gene 

ESTs 

ESTs 

ESTs, H^hiy similar to unnamed protein 

KIAA0251 protein 

metaiiofhionein IE (funcfional) 

U2 small nuclear flbontideoprolBln audi 

Homo sapiens cDNA: RJ21669 Ss, dbne H 

ESTs 

ESTs 

arytsulEataseO 
RAN binding proteffll 
ESTs 
tffin 

gban03o03jc1 Sirakagene sehizo brain 81 



1.69 


1.93 


1.00 


1.00 


1.19 


1.27 


4.90 


4.11 


199.42 


34aoo 


1.00 


moo 


1.00 


1.00 


3.44 


5.17 


^84 


&5B 


hypothetical protein ftifG(S629 


1,74 


1.60 


19^ 


1.00 


24.92 


7sm 


167.99 


66^)0 


1.00 


39.00 


1.97 


1.58 


63.89 


20.00 


20.41 


16.00 


7.09 


22.00 


29.75 


19iI0 


1.00 


1.00 


Z06 


1.81 


3.02 


131.00 


1.00 


131.00 


1.23 


1.02 


3Qi63 


171.00 


1.00 


1.00 


1.01 


1.45 


1.26 


1.11 


&33 




4.30 


7.82 


1.00 


1.00 


13.75 


103.00 


2Dai1 


1,00 


1.00 


1.00 


1.14 


1.10 


1.00 


4aoo 


16Z25 


189.00 


2.12 


1^ 


1.15 


1.94 


1.00 


1X0 


16.42 


84J00 


0.57 


1.76 


2.71 


4.15 


46.37 


47j00 


1.12 


1.35 


1.55 


2.51 


1.00 


55.00 


4.36 


3.18 


1.00 


1.32 


1.00 


22.00 


2.06 


1.88 


1.00 


1.00 


7.00 


9.85 


1.00 


3.00 


1.31 


Z01 


1.98 


1.71 


1Z60 


63.00 


1.00 


tJ» 


1.00 


237.00 
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4.58 



90.00 



TABLE 9B 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey 
407746 



CAT Number 
10125J 



408070 


103668fiLl 


408660 


107294J 


409622 


113735 1 


409866 


1156522.1 


410032 


1170435.1 


411089 


12317^1 


411152 


1234028J 


412537 


1304.1 



AKD01962 !«9415 BE464605 AA418699 AA053293 AA149075 AA058396 AW3^226 AV\^2659 AM54607 AI139535AW469852Al^ 
AW271982 AA730033 AAS76607 AAg91217 AA782067 At985851 AA805864 AA50S598 AW469857 R69546 AA988279 AW001647 N83320 
D82661T27343AA306950AA360989l«B77a 
AW148dS2BE350895 

AAS25775 AA0S6342 Al538g78 AW9752B1 AM64g86 

AAD7S382AA075431 

AW502152 H41202H29772 



AA456454 AA713730 AA091294 AA584921 N86077 AW836781 AA601031 AA679876 AA551106 AA633188 AW90S577 Ai955808 A1679386 

AI679895AA514764AA454562AI082382AA595822AA551351AA586369AA666384AA1M934AA668398AA551297AW 

BBJe9139 AW936012 AW8774e6 AW819782 AW935798 AW835546 AW93e042 flE06912l AW835625 AW877536 AW935885 BEQ69202 

AW820019AW935937BE160180AVy935346BE069101 BE069125AyVB77527BE160316BE160398AW935794AW835701 AW935784 

AL031776 X5971 1 NIM.002505 M59079 Ai670439 Ar494259 AW664010 AA405083 AA438132 eE1745l6 AA412691 AI400314 AA436024 

T29403BE079412BE079428^i90322A1631202AiU1758Al016793AI167666A^88^075A^37S230AIZm 

AW953918AA927051AA889823BEOQ3094AW390155AW360805AW3e0823AW360810AA425472AI^^ 






412811 132943.1 



413690 1383256.1 
414883 15QZ4.1 



415989 156454.1 

417324 166710 

418574 17690.1 

418712 1784125J 

419443 184788.1 

419502 18535.1 



419936 
«1582 



422128 
423034 
423816 
424200 



426991 
427260 
428023 



429220 
429976 



430439 
430935 
431089 
431322 
432407 



189181.1 
2041.1 



211994_1 

224122.1 

23234J 

238595.1 

245835J 

273896.1 

27415J 

276598.1 



301384.1 
31150.1 



31808.1 
325772_1 
327825 t 
331543.1 
34624.1 



434414 38585.1 



436608 42361.3 



AI478773 Atl 60445 A!674630 N69088 AW665S29 ^9278 All 29239 AI4578S0 AI821264 AW2971 52 AI2^^^ 
AI963541 AI469807 Ai969353 BE552366 N66509 AA736741 AA38255S AW075811 AW292026 

H06382AW957730AA352014R13591AA121201 060420 BE283253BS)47862Z41952A)424991 AI693S07AI863t08AA599060A]091148 

AA598669R39687AA8134KAVV016452HI)6383f»1807AI36426BAA620528A{241940A 

AA121202R17734 

BE157489BE157560 

AA92e980 AA9269S9 W76521 W24270 W21S26 AA037172 BE267636 H83188 AA4699Q9 N86396 AA001348 BE535736 AA081745 BE566245 
AA082436 H72525 H77575 N49786 W80565 H78746 B£Se9085 WD4339 R98127 T5S938 BE279271 AW960304 T29812 AA476873 B£2g7387 
AA292753 AA177048 NM.001826 X54941 BE314366 AAg08763 AI7ig075 BE270172 BE26g819 AAB89955 AI204630 W25243 A]9351SO 
AA872039 W72395 199630 AI4226gi H98460 N31428 BE255916 H03265 A1857576 AA776920 AA910644 AA459S22 AA293140 AW514667 
R759S3 AW662396 AA6^22 AI865147 A1423153 AW262230 AA584410 AA583187 AW024595 AW089734 A1828996 AA282997 AA876046 
AW613002 AAS27373 AWg72459 AI831360 AA621337 AA100926 AA772418 AA594628 AI833892 W9S096 AI034317 AA3g8727 AI085031 
N95210AI459432A]041437AA932124AA627684AA935829AI004827AI423513AI094597 H42079 R54703AI630^ 
AA643280 W44S61 At991988 A1537692 AI090262 AA740817 AI312104 AI91 1822 AA416871 AI185409 AA129784 AA701623 Aia75239 
A1139549 AA633648 A1339996 AI336880 AA399239 AI078708 AI085351 AI362835 AI346618 A1146955 Ai989380 AI348243 N92a92 AA7658S0 
AI494230 AI278887 AA962596 AI492600 W80435 AA001979 R97424 AI129015 N24127 AA157451 AA235549 AA4S9292 AA0371 14 AA1 29785 
AI49421 1 AWD59601 AWB86710 R3Z790 N59755 A1361128 AW569407 H47725 H97534 H4a076 H484S0 T99631 AW300758 H03431 R76789 
AA954344H77576R96823AI457100Nd2845N4S682 H42038BE2206986E220715H995S2AA701624 
H03266BE261919AA76g633AA480310AA507454AA910586AI203723AWia4725VV25611VV25071T6B980HO»^^ 
W95095 R97470AA702275 T77551 AA911952H82956 N83673AA283872 

AI267700 AI720344 AA191424 AI023543 A)469833 AA1 72056 AW958465 AA1 72236 AW953397 AA355086 

AW265494AA455904AA195677AW265432AW99160$AA456370 

N28754 N28747AI568146AI979338AA322671AA322672AVV95S043A1990326AA776406^^^ 

W70051 AI038748 AA831327 AI925845 AW945895 

242183 T31621T97478 

062703 AA242966 D79798 

AU0767O4 T74854T74860T72098 173285 T73873T69180 n4658TB8786T60385T73410T68781T678^ 
768367768401 T53959 T72360 T72099 T60377 T58961 T71712 T7282t T64738 T74645T72037 T68688 T72063 T7325B 172826764242 
T68220 T74673 T71B0O T68355 T61227 T62738 T69317 T53850 T64692 T7376B n3962 T73^ T68914 n0975 T73400 T60631 T73277 
T73203 T70498 T61409 T58925 NM_000508 M64982 T68301 T73729 T69445 T60424 T67922 T67736 T68716 T67755 T74765 U3819 T58719 
T74766 T60477 T74853 T61 109 T68329 T58850 T71 857 T73425 T53736 T68607 T58898 T64309 T72031 T72079 T64305 HI 908 T681 07 
T71916T73787T56035 T64425T71870T60476T61376T67620T71B95T41006T69441T6817^ 

H48353 T71914T53939T64121 AA6g3998T72S25T67779T68078AA011465AA345378AV654847AV854272AV656001 AI064740 T82897 
N33594 AA344542 AW805054 AI207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092 T39959 At017721 AA312395 
AA312919T40156 H66239 AV652989 H38726 R98521 AV655200 R95790 W03250 W00913 AA344136 AV660126 R97923 AA343596 
AW470774 AV651256 N54417 AA812862 AW182929 AIll 1 192 H61463 H72060 AA344503 H38639 AI27751 1 AV661108 AI20762S T47B10 
AA23S252 T27853 T47778 R95746 H70820 AA701463 AW827166 R98475 C20925 AV6572B7 T71959 T71313 T73920 T73333 T61616 T69293 
169283 T73931 T72178 T72456 AV645639 AV653476 n2957 T72300 T58906 T71457 T70494 T72956 ^ 

AA344726 T27854 T744B5 T741 01 T73868 T71 51 8 T72304 AA343853 T73909 T6B070 T72065 H721 49 T73493 T73495 AV645993 R02293 
T70475 T64751 AA344441 AA343657 AA345732 AA344328 All 1G639 AA344603 AF083513 T64696 T68516 T72223 T60507 T67633 R29500 
T72517 R02292 760599 T69206 T70452 T74677 R29366 T61277 T74914 T60352 R29675 T74843 AV645792 AA34440a 769197 772057 
T69388 769^ T6825BA\«429 773341 761702 774586740095 KQ22Z2740106AA343045AA341908AA341^ 
T53747 772042 762764 AI064899AAM3060 767832 772440 771770 768091 769108772449769167771289768251 AV654844 764375 
AA345234 767598 AA011414 768036 H48262 A1207557 TB8219 W86031 769081 764232 R93196 762136 AV650539 H67459 T72978 
AA3445B3 760362 H5B121 79571 1 772803 768055 ni71 5 R29036 772793 769122 764595 762888 769139 768291 764652 767971 T46862 
AA693592 AI248502 R29454 764764 757001 773052 771429 751 176 758866 AV655414 Kg0426 AA342489 773666 767848 77251 2 753835 
767837 773317774273 T69420 768245 774380767882774474 756068 
AJ792788BE142230AA252019 

A1910275 XD0474 X52003 XD5030 NM.003225 AA314326 AA308400 AAS06787 AA314825 AI571948 AA507595 AA614579 AA587613 R83818 

AA568312 AA614409 AA307578 A}925552 AW950155 AI9 10083 M 12075 BE074052 AW004668 AA578874 AA582084 BE074053 BE0741 26 

BE074140 AA514776 AA5B80348E074051 BE074068 AW009769 AW0S0690 AA858276 R55389 AI001051 AVU050700 AW750216AA614539 

BE074045AI307407AW6Q2303BE07357SAI202S32AA524242AI970B39AI909751 BE076078Aig09749RS5292 

AW881145AM90718MB5637AA304575 T06067AA331991 

AL1 19930 AA320698 AW7S2565 

AL03t985AL137241 A/792386AI733664 AI657654AI049911 

AA337221 AA336756AW966196 

AW953120 RS6325 AA34856Z 

AI493134AI498691 AW771S08 AI498457 AI768408 AI783624 AI383985 AI580267 D79613 AA393766 
AK001536 AA191092 AW510354 AI554256 AL353968 AA134266 
AA663848AA400100AA401424 

AL038843AA161338BE268213AA425597 N87306AA092g69BES66038AA247451N47392AI928802AW162584AVV^^ 
AI836994W5625B AI663448A1278611 AI2B3557 A!824306AW338658AW150899AA687514 N47393 N298B5AA973469At038904 AI292064 
AI034339 AW674S93 N721S6 AI079733 A1038683 AI291616 AM91599 AAg93675 AAB37380 BE006554 BE006473 AI087090 733044 
AA662043AI203503AAS83959VV35283AI129926 Z41844AVin20925AV\^5648AI684603AA493297^^^ 
A1932767 W02632 BE396786 R37261 
AW207206AW341473AA448195A1951341 

AA24g027 AU)38984 AK001993 AL080066 AV65272S BE566226 AA345557 AA31S222 AA090585 AA3756^ 

AW607939 H51658 D83880 N84323 BE296821 AW947Q07 061461 AW079261 AA3294B2 AW901780 AI354442AA772275 R31663A1354441 

AI767525H92431 AJ916735 H93575AJ394255AW014741 AI573090 C06195 AW612857AW265195At339558A)377532 AI306821 AI919424 

AI5B9705 AW0SS215 AI3^2 AI338051 AA806547 C75509 C00618 AW071 172 AW769904 AA630381 AI678018 A!863985 D79662 BE221049 

AW265018 AI589700 AW196655 N76573 AI370908 BE042393 N75017 AI698870 AW960115 

AL133S61 AL041090 AL117481 AL122069 AW439292 Aig88826 

AW072916 AI184913 AA489195 AW466994AW469044 N59350 AI819642 A1280239At220572 AA789302 AI473611 AW841126 D60937 
BE041395 AA49f 826 AA621946 AA715980 AA66S102 
AW970822 AA503009 AA5Q2998 AA5029B9 AA502805 792168 

AA221036 R87170 BE537068 BE544757 C1B935 AW812058 792565 AA227415 AA233942 AA223237 AA668403 AA601627 AW669639 

BE061833 BE000620 AVV981170 AVVB47519 AA308542 AW821833 AW9456B8 C04699 AA205504 AA377241 AV^ 

AW817S81AW856468AA155719AA17992B 703007 AW75429BAA227407AA113928AA307904C168S9 

AI7SB376 S46400AW81t617AW811616W00557BEM2245AVV858232AW861851AVVS5B362AA232351A^^ 

AV«B57541AW814172H66214Ami4398AF134164AA243093AA173345AA199942AA223384AA227092AA2270807^ 

761139 AA149776AA699829AVV879188AW61d567AW813538A1287168AA157718AA157719AA100472AA100774AA 

AA157730 AA157715 AA053S24 AWB49581 AW854566 005254 AW882B36 792637 AWB12621 AA206583 AA209204 BE1S6909 AA226824 

AI829309AVV991957N66961AAS27374He6216AA045664Aie94265 H60808AA149726AW195620BE08^ 

AW817705AW817703AW817659BG081631 HS9570 

AA628980At1266038E504035 
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438091 44964J AW373082 T55662 A)299190 BE174210 AW579001 H01811 W40186 R67100 AI923BB6 AW952164AA628440 AW8986a7 AW898616 

AA709126 AW898S28 AWBg8544 AA947932 AW898625 AW898822 AI276125 All 85720 AW51 0698 AA987230 T52522 BE467708 AW243400 

AW043642AI288245 AI186932 D52654 D55017 052715 D52477 D53933 D54679 AI298739 A11 46984 AI922204 N98343 BE174213 AA845571 

AI813854At214518AI635262AI139455AI707807AI6g8085AW8B4528Al02476BA1004723AVVOB742DAI 

AW513280 AI061 126 A)435818 A)859106 A)360506 A!024767 AAS13019 AA757S98 XS6196 AAg029S9 AI334784 AI860794 AA010207 

AW890091 AW513771 AI951391 A)337671 T52499 AA890205 AI640908 H75966 AA463487AA3586B8 Atg61767 AI868295AA780994 

Aig65gi3BE174196AA029094AW592159TB5581 N79072AI611201 AA910B12AI220713AW149306AI7S^^ 

439000 467716J AW979121 AA847986AA82g0g8 

439285 47065J AL133916 N79113AF086101 N76721AW950B28AA384013AVV955684AI346341 AI867454N54784AI655Z70AI421279AVVD14882 

AA775552 N82351 N59253AA626243At341407BE175639AA456g68AI358918AA4S7077 
4397B0 47673J ALt09688 R23665 R26578 

441128 S1021_? AA570258 AW014761 AA573721 A)473237 AI022165 AA554071 AA127551 N90S25 AW973623 AA447991 AA243a52 BE32B850A114B171 

A1359627AI005068 AI356567 AA232991 AW016855AA906902AA233101 AA1275S0 BE512923 
443068 5S8B74_1 AI188710AI032142AW078833 N30308AW675632AI21902BAI341201 N22181 H95390 
443947 586160J W24187 W24194R17789 

447636 7301.1 Y10043 Nh^005342 L05085 AL034450 BE614226 AW749053 AA37gi73 AA248230 BE5t4634 AA334622 R70656 AA367593 AA214649 

AA369318 AW957081 R05760 AA039903 AI8B6597 AW630122 AAg06264 AA041 527 R01145 Ai0886B8 BE463637 AA398795 Aj354863 
AI768938AI569996AI452952A1168582AI189869A1086670AW262560AW613854AA&62839AA435840AA670ig7A102 
A1990C69 N810g5 AAB47919 AW960150 AA21 1075 AA044704 AA367594 AW582S87 AW8588S4 AW818630 AW818281 AW818433 AW582S95 
AA096002N83992 

448993 79225_1 AI471630 BE540537 BE265481 AW407710 BE513882 BE546739 AA053597 BE140503 BE218514 AW9S6702 AI658234 AI636283 AI567265 

AW340a58 BE207794 AA053085 R69173 AA292343AA454908 AA293504 AI659741 Ai927478AA399460 AI760441 AA346416 BE047245 
AA730380 AA394063 AA464833 A1982791 AIS67270 AI813332 AI767858 AA427705 D20284 AI22145B BE048537 AI26304eAA346417 
AA911497BE537702 

449305 B04424J AI638293AW813561 

451105 859083J A1761324AW8B0941 AW880937 

451320' 8B576J AW1 18072 AI631 982 T15734 AA224195 AI701458 W20198 F26326 AA890570 r«0552 AW071907 AI671352 A137S892 T03517 R88^ 

A)124088 AA2243eB AI08431 6 A1354686 T33652 A)140719 AI72021 1 703490 AI372637 T15415 AW205836 AA6303&4 
AA017131 AA443303T33623 AI222556T33511 T33785AI419605 D55612 

451807 ; 8865J W52854AL117600BE208116BE208432BE206239BE082291 AW953423AA351619BE180648BE140560 W60080AAB6 

AW450852 AW44951 9 AA993634 AI806539 AA351618 AW449522 AI827626 AA904788 AA380381 AA886045 AA774409 BE003229 Z41756 

4S2410 9163.1 AL133619 AA468118 AA383084 A)476447 T09430 A1673758 AA5248g5 AI581345 AI300820 AW498812 AA256162 A)559724 AI685732 

AA602400AA905453AI204595AW1 66541 AA157456AA156269AA383652AA431072AW592707AI435410AVV272464AI2^ 
R74039 N3S031 AI804128AW513621 AA868351 AI026826AI49338B AA614641 W81604 Al55706a AI214351 AA730140AI1257S4AI200B13 
A1269603 AI565082 A)80709S AI476629 AA505909 AI368449 AI686077 A(582930 AW085038 AA757863 AA730154 AI767072 AA468316 
A1734130 AI734138 AA426284 AA433g97 AI741241 AW043S63 A1732741 AI732734 AA437369 AA425820AA664048 R74130 

454241 1087807.1 6E144666BE1 84942 AW238414BE184846 

455175 1257335J AW993247AW861464 

456237 168730.1 AA2036B2R11958 

458098 47385J BE550224AA832519IW5402AVV885B57N29245BE455409VVD7677AVV970089AI299731 AA462971 BE503548H18151 W79223AF086393 

AA461301 W74510 R34182 AI090689 N46003 BE071550 R28075 AW134g82 AI240204 AI136906 AW026179 AI572316 BE466182 AI20638S 
AI276154 AI273269 AI422817 AI37iai4 A1421274 Ai188525 AA939164 BE549810 AW137865 AI694896 BE503841 AA459718 BE327407 
BE467534 BE218421 BE467767 AA989054 BE467053 A1797130 6E327781 



TABLE 9C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 





Pkey  Ref  Strand  Nt_position 

400512  9796593  Minus  1439-1615 
400517  9796686  Minus  49996^0346 
400560  9843598  Plus  94182-94323,97056^43,101095-101236,102824-103006 
400664  8118496  Plus  13558-13721,13942-14090,14554-14679 
400865  8118496  Plus  16879-17023 
400566  8118498  Plus  17982-18115,20297-20456 
400749  7331445  Minus  9162-9293 
400763  8131616  Minus  35537-35784 
401027  7230983  Minus  70407-70554,71060-71160 
401093  8516137  Minus  22335-23166 
401203  9743387  Minus  172981-173056,173868-173928 
401212  9858408  Plus  8783&-88Q28 
401411  7799787  Minus  144144-144329 
401435  8217934  Minus  54508-55233 
401464  6662291  Minus  170688-170834 
401714  6715702  Plus  96484-96681 
401747  9789672  Minus  118896-118816.119119-119244,119609-119761J20422-12099ai30161-130381.130468-130593J3l097.1^ 131932,132451-132575^133S80-13401 1 
401760  9929699  Plus  83126-83250,8532045540.94719-95287 
401780  7249190  Minus  28397-28617^920-29045,29135-29296.29411.29567.29705.29787,30224^0573 
401781  7249190  Minus  83215-83435,83531-83656,83740^01.84237-84393.84955.85037.86290-86814 
401785  7249190  Minus  165776-165996.16618&.166314,166408.166569,167112-167268,167387.167469.168634-168942 
401797  6730720  Plus  6973-7118 
401961  4581193  Minus  124054-124209 
401985  2580474  Plus  61542-61750 
401994  4153858 
TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic Targets for Therapy of Lung Cancer and Non-malignant Lung Disease 
Table 2A shows about 307 genes upregulated in non-malignant lung 
normal lung and non-malignant lung disease. These genes were selected 

Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 10A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 10C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 10A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
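
The R1 and R2 definitions above are ratios of group averages. A minimal Python sketch of that arithmetic follows; the function name `fold_ratio` and the intensity values are hypothetical illustrations, not data from the tables, and any normalization or background handling applied to the actual array data is not described in this passage and is not modeled here.

```python
from statistics import mean


def fold_ratio(group, normal):
    """Average of one sample group divided by the average of normal samples."""
    if not group or not normal:
        raise ValueError("both sample lists must be non-empty")
    normal_avg = mean(normal)
    if normal_avg == 0:
        raise ValueError("average of normal samples is zero; ratio undefined")
    return mean(group) / normal_avg


# Hypothetical expression intensities for a single probeset (illustrative only).
tumor = [120.0, 80.0, 100.0]   # lung tumor samples
disease = [30.0, 50.0]         # non-malignant lung disease samples
normal = [20.0, 20.0]          # normal lung samples

r1 = fold_ratio(tumor, normal)    # R1: lung tumors vs. normal lung
r2 = fold_ratio(disease, normal)  # R2: non-malignant lung disease vs. normal lung
print(f"R1={r1:.2f} R2={r2:.2f}")  # prints "R1=5.00 R2=2.00"
```

Under this reading, a value of 1.00 in either column means the group average equals the normal lung average.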
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Hs.31141 


Homo sap»ns mRNA for KIAA1566 protein. 


409031 


AA376836 
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Hs.167793 


ESTs 


412351 


AL135960 


Hb.73828 


T-ce9 acuta ly^^Aocyticleutemia 1 


412372 


R65998 


H&J85243 


hypothetical prc^n FU22029 
Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 

Pkey CAT Number Accession 

408074 103884 1 R20723AA263003AA333976AA334725AA334151AVV965490AA310513AI810530 031302 Alrt/t348S7AABa^^^ 
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BE1467g7 BE146776 BE146985 BE146793 BE146768 BE146771 BE146954 BE146760 BE147048 BE147025 BE147030 

423387 22779 1 AJ012074U11087L13288 X75299 U0295AW630780H14880n8037 AI872991 R72136AW449839T81522T79697T29519 R94105 T83823 
R73300 AI797007 R73390 AAS61010 H741 68 AI689932 BE045543 AI808416 AI608912 AI808573 AW8840B4 AW872978 AW872985 AA56S655 
AI022915 R50647 R73210 H45098 R46451 AW166289T71132 AI264547 R52146AI304920 R73391 AW884059 AW884085 H73241 T60038 
TABLE 10C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 
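
As an illustration of how an Nt_position entry can be read, the sketch below parses a comma-separated list of start-end ranges into exon coordinates. The function names are hypothetical, the example field is copied from a legible Table 9C row, and the parser assumes a cleanly transcribed field (digits, hyphens and commas only); inclusive coordinates are also an assumption, since the legend does not state the convention.

```python
def parse_nt_position(field):
    """Parse an Nt_position field such as '13558-13721,13942-14090' into (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        exons.append((int(start_text), int(end_text)))
    return exons


def total_exon_length(exons):
    """Sum exon lengths, treating coordinates as inclusive (an assumption).

    abs() keeps the result independent of the order in which the two
    coordinates of a range are written.
    """
    return sum(abs(end - start) + 1 for start, end in exons)


exons = parse_nt_position("13558-13721,13942-14090,14554-14679")
print(exons)                     # [(13558, 13721), (13942, 14090), (14554, 14679)]
print(total_exon_length(exons))  # 439
```

Fields that still carry extraction damage would raise a `ValueError` here rather than be silently misread.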
Table 11A shows about 64 genes upregulated in lung adenocarcinomas relative to other lung tumors, non-malignant lung disease, and normal lung. These genes were selected 
from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip 

Table 11B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 11A. For each probeset, we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 11C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 11A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 

average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
Hs.1674  glutamine-fructose-6-phosphate transamin  1.00  50.00 
424502  AF242388  Hs.149585  lengsin  1.00  1.00 
424544  M88700  Hs.150403  dopa decarboxylase (aromatic L-amino aci  i4n  59XK) 
424905  NM_002497  Hs.153704  NIMA (never in mitosis gene a)-related k  21.35  1.00 
424960  BE245380  Hs.153952  5' nucleotidase (CD73)  1.00  1.00 
425523  AB007948  Hs.158244  KIAA0479 protein  1.00  35.00 
426230  AA367019  Hs^41395  protease, serine, 1 (trypsin 1)  1.00  83.00 
427701  AM11101  Hs^43886  nuclear autoantigenic sperm protein (his  7.41  34.00 
428585  AB007863  Hs.185140  KIAA0403 protein  1.00  6.00 
428758  AA433988  Hs.98502  hypothetical protein FLJ14303  1.06  1.13 
429170  NM_001394  Hs.2359  dual specificity phosphatase 4  1&1B  105.00 
429263  AA019004  Hs.198396  ATP-binding cassette, sub-family A (ABC1  1.07  m 
429610  AB024937  Hs^11092  LUNX protein; PLUNC (palate lung and nas  1.59  1.69 
430508  AI015435  Hs.104637  ESTs  4.75  7.27 
430985  AA490232  Hs^23  ESTs, Weakly similar to 176885 aerinsfth  0.94  1.28 
431548  AI834273  Hs.9711  novel protein  5.66  154X) 
431566  AF176012  Hs^0720  J domain containing protein 1  49.76  ZTM 
431986  AA536130  Hs.149018  Novel human gene mapping to chomosome 20  1.19  1.47 
432375  BE53^69  Hs^62  S100 calcium-binding protein P  1.65  1.06 
432677  NM_004482  Hs^8611  UDP-N-acetyl-alpha-D-galactosamine:polyp  1.00  46.00 
433556  W56321  Hs.111460  calcium/calmodulin-dependent protein kin  1.00  19.00 
433819  AWS11G97  Hs.112765  ESTs  3.71  aoo 
434001  AW950905  Hs.3697  serine (or cysteine) proteinase inhibito  29.31  72.00 
434424  AI811202  Hs.325335  Homo sapiens cDNA: FLJ23523 fis, clone L  1.00  84.00 
434792  AA649253  Hs.132456  ESTs  8.S2  UJK 
436217  T53925  Hs.107  fibrinogen-like 1  67.97  31.00 
436749  AA584890  Hs.5302  lectin, galactoside-binding, soluble, 4  1.10  1.41 
436972  AA284679  Hs.25640  claudin 3  1.59  1.46 
437866  AA156781  metallothionein 1E (functional)  3.62  101.00 
437935  AW939591  Hs.5940  mucin 13, epithelial transmembrane  1.60  1.39 
438915  AA280174  Hs.285681  Williams-Beuren syndrome chromosome regi  1.00  1.00 
439451  AF086270  Hs.278554  heterochromatin-like protein 1  23.28  52.00 
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n5.D/ fwf 


441001 


All (Uuln 


nS.fD40 


AA^vn 
44idi/ 






AA^CAA 

44o0l4 


AVOOOOOD 


rls.(04D 


44o6i3 


AAoroorZ 




443991 


NM_UUZ250 


riS.iUOo£ 


444670 


H58373 




444SWI 


A trance 




446102 


AlAMAOnc? 


Ue ^17fiQA 
riS. J 1 /Dm 


446163 


A AniCQfin 
AAU^OOoU 




AAHARQ 
440405) 




Ue -IRII^ 

ns.i3iio 


447368 




Hs,76277 


447532 


AK000614 


H5.18791 


448243 


AW369ni 


Hs.52620 


448844 


A1581519 


Hs.177164 


449444 


AW818436 


Hs^3590 


451607 


W52854 




452689 


F33S68 


Ns^84176 


453392 


U23762 


Ks.32964 


453464 


A1884911 


Hs.32989 


453735 


AI066629 


Hs.125073 



Homo sapiens mRNA full length insert cDN

fibrinogen, B beta polypeptide

ESTs

fibrinogen, B beta polypeptide

Homo sapiens mRNA; cDNA DKFZp667D095 (fr

potassium intermediate/small conductance

hypothetical protein MGC5370

general transcription factor IIA

ESTs

Homo sapiens cDNA FLJ13603 fis, clone PL
homogentisate 1,2-dioxygenase (homogenti
Homo sapiens, done M6C938)t mRNA. comp 
hypothetical protein FLJ20607
integrin, beta 8
ESTs

solute carrier family 16 (monocarboxylic
hypothetical protein FLJ23293 similar to
transferrin

SRY (sex determining region Y)-box 11
receptor (calcitonin) activity modifying
ESTs



1.00 


21.00 


1.41 


99.00 




1.00 


1.00 


16.00 


1.20 


1.89 


5.71 


6.87 


1.98 


38.G0 


i.oa 


54.00 


1.00 


1.00 


1.00 


36.00 


1.00 


11.00 


1.24 


1.16 


1.23 


1.63 


15.84 


1.00 


1.00 


31.00 


1.00 


83.00 


1.55 


35.00 


1.54 


1.44 


1.00 


16.00 


1.55 


^45 


1,01 


1.30 



TABLE 11B 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT Number Accession

410399 11995 t BE068889 BE068882 AF044311 AF017256 NO03087 AF037207 AF010126 AA633976 AA872836 BE298825 BE299a89 AI016464 AI6B4600 

AI936527 AA804675 AA394097 AI139933 AA946606 BE171313 AA722407 AA293803 AI468480 AA056035 AA055968 AW796957 AI637713 
AA410737 H49348AA486472 AA411094AA235594AA402624 AA443638 AW452137AA421708AW265211 AI493266AA365132AVV966044 

419502 18535 1 AU076704 T74854T74860T72098T73265T73873 T69180T7465BT58786 T60385 T73410TBB781T67845T67593 T73952T6786^ 

T68367T68401 T53959T72360 T72099 T60377 T58961 T71712T72B21T6473BT74645 T72037 T68688 T7a)63T7325BT72826T64242 
T68220 T74673 HI 800 T6B355 T61227 T62738 T6931 7 T53850 T64692 T73768 T73962 T73382 T6891 4 T70975 T73400 T60631 T73277 
T73203 T70498 T61409 T58925 NiyL000608 M64982 TB8301 T73729 T59445 T60424 T57922 T67736 T68716 T67755 T74765 T73819 T56719 
T74756Te0477T74863T6l109T68329 T58B50T71857T73425 T53736 T68607T58898T64309 T72031 172079 T64305n^ 
T71916 T73787 T56035 T64425 ni870 T6D476 T61376 T67820 171895 T41006 T69441 T68170 T74617 ni958 T69440 T61875 R06796 
H48353 T71914 T53939 T64121 AA693996 T72525 T67779 T68078 AA01 1465 AA345378 AV654847 AV654272 AV656001 AI064740 T82897 
N33594 AA344542 AW805054 AI207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092 T39959 AI01 7721 AA312395 
AA312919 T40156 H6G239 AV652989 H38728 R98521 AV655200 R95790 W03250 W00913 AA344136 AV660126 R97923 AA343598 
AW47a774AV651256N54417AA812862AW182d29Ai111192H61463 H72060AAd44503 H3B539AI277511AVK^ 
AA235252 T27853 T47778 R95746 H70620 AA701 483 AW8271 66 R98475 C20925 AV657287 T71959 T7t31 3 T73920 T73333 T6161 8 T69293 
T692B3 T73931 T72178 T72456 AV645639 AV653476 T72957 T723Q0 T68906 T71457 T70494 T72956 T70495T68267 T74407 T85778 
AA344726 727854 T7448ST74101 T73868T71S18T72304AA343853T73909T68070 T720S6H72149 773493 T73495AV645993 R02293 
T70475 T64751 AA344441 AA343657 AA345732 AA344328 All 10639 AA344603 AF063513 T64696 T68516 T72223 T605Q7 T67633 R29500 
T72517 R02292 T60599 T69206 T70452 T74677 R29366 T61 2n T74914 T60352 R29675 T74843 AV645792 AA34440B TK^ 
TB9368 T6935B T68258 AV650429 T73341 T61702 T74598 T40095 K02272 T40106 AA343045 AA3419M AA341^^ 
T53747 T72042T62764 AIC64899AA343060T67832 T72440 T71770T58091 T69108T72449T69167 T71289T68251 AV654844 T64375 
AA345234T67598 AA011414T68036 H48262 AI207557 T58219 W86031 T69081 T64232 R93196T62136 AV650S39 H67459T72978 
AA344583 T60362H58121 T95711 T72803 T68055T71715 R29036T72793T69122T64595T62888Te9139T68291 T64652T67971 T46862 
AA693592 AI248502 R29464 T64764 T57001 T73052 T71429 T51 1 76 T158866 AVB55414 H90426 AA34Z489 773666 T6784B T72512 T53B35 
T67837 T7331 7 T74273 T69420 T68245 T74380 T67862 T74474 T56068 

421582 2041 1 AI910275 X00474 X52003 XD5030 NM 003225 AA314326 AA308400 AA506787 AA314825 A1571948 AAS07595 AA614579 AA587613 R83818 

AA568312 AA614409 AA30757B AI925552 AW950t55 AI910083 tyl12075 BE074052 AW004668 AA578674 AA582084 BE074C53 BE074126 
BE074140 AA5U776 AA588034 6E074051 BE074068 AW009769 AW050690 AA858276 R55389 AI001051 AWDS0700 AW750216 AA614S39 
BE074045 Ai307407 AW602303 BE073575 A1202532 AA524242 A(970839 AI909751 BE076078 A1908749 1^92 

437866 44433^ AA156781 AW293839 U520MAA024963AA778446BH)73977AW444904AW602574BE164040BE164012BE163972BE163974BE1639^ 

AA83748t AW468444 BE185091 AW468002AA687333 AA81 1830 AA58 1806 AI866686AI5721 24 AA043777AA040926 D20160AI536733 
AA812489 AW874142 Ai471883 W84421 AA156850 

451807 6865 1 VV52854 AL1 17600 BE2081 16 BE20B432BE2Q8239BE082291AVV953423AA351619BE1 80648 BE140560We0080AA8^^^ 

AW450652AVV449St9AAg93634AI806539AA351616AVV449S22AI827&26AA904788AA380» 



TABLE 11C

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

403329 8518120 Phis 96450^8 

406399 9256288 Minus 63448^4 
TABLE 12A: Genes Distinguishing Squamous Cell Carcinoma

Table 12A shows about 72 genes upregulated in squamous cell carcinomas of the lung relative to other lung tumors, non-malignant lung disease, and normal lung. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 12B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 12A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.

Table 12C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 12A. For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
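
The R1 and R2 columns defined above are ratios of group averages. A minimal sketch of that arithmetic follows, assuming each probeset has a list of expression intensities per sample group; the function name `expression_ratio` and all intensity values are hypothetical illustrations and are not taken from the tables.

```python
from statistics import mean


def expression_ratio(group_values, normal_values):
    """Return the average of a sample group divided by the average of normal lung samples."""
    normal_avg = mean(normal_values)
    if normal_avg == 0:
        raise ValueError("average of normal samples is zero; ratio is undefined")
    return mean(group_values) / normal_avg


# Hypothetical intensities for a single probeset (illustrative only).
tumor = [120.0, 80.0, 100.0]      # lung tumor samples
non_malignant = [12.0, 8.0]       # non-malignant lung disease samples
normal = [10.0, 10.0, 10.0]       # normal lung samples

r1 = expression_ratio(tumor, normal)          # R1: tumors vs. normal
r2 = expression_ratio(non_malignant, normal)  # R2: non-malignant disease vs. normal
print(f"R1 = {r1:.2f}, R2 = {r2:.2f}")
```

Under this reading, a probeset with a high R1 and an R2 near 1.00 is elevated in tumors but not in non-malignant lung disease.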



Pkey


ExAccn


UnigeneID


Unigene Title


R1


R2

400259 


X07620 


Hs.2258 


matrix metalloproteinase 10 (stromelysin


132.45 


4.00 


400666 






NM_002425:Homo sapiens matrix metallopro


3.26 


a22 


401780 






NM_005557*:Homo sapiens keratin 16 (foca


2aA7 


IOlSO 


401781 








10.33 


4.61 


401785 






NM_002275*:Homo sapiens keratin 15 (KRT1


4.13 


2.70 


401994 






Target Exon


61.84


47.00


402075 






ENSP00000251056*:Plasma membrane calcium


1.00 


1.00 


404996 






Target Exon


1.00 


1.00 


407B39 


AAQ4S144 


Hs.161556 


ESTs 


17191 


108.00 


dnnnnn 

tuouuu 


L11690


Hs.620


bullous pemphigoid antigen 1 (230/240kD)


151.17 


&00 


408522 


AI541214


Hs.46320


Small proline-rich protein SPRK [human,


1.98 


1.24 


410561 


RP<i402SS 


Hs.6994 


Homo sapiens cDNAi RJ22Q4^ fiSt dom H 


10.04 


1.00 


415091 


AL044672 


H.4 77<)1Q 


"Uhvrtrnw-vVnTfllhvffiliiiaTvi-nnflnTvrnfi A sv 


1.00 


30.00 


415817 


U88967 


Hs.78867


protein tyrosine phosphatase, receptor t


24.30 


i,(io 


416658 


U03272 


Hs.79432 


fibrillin 2 (congenital contractural ara


53.29 


51.00 


11 ruj*» 


MU flflRIQQ 






1.00 


1.00 


417366 




Hs.1076 


small proline-rich protein 1B (cornifin)


8.97 


317 


IIOOOO 


Aknfliinn 

MIMJU 1 1 UU 


nSiH IDSU 


flaenvw^llln ^ 

UOStlllUlAUUII w 


112.17 


19.00 


1100/0 




Mc R797R 


cancer/testis antigen


1.18 


1.10 


419121 


AA374372 


Hs.89626


parathyroid hormone-like hormone


1.00 


1.00 








inrtin natantfwMaJtfndlnfl mliiMiL? 


3.04 


1.25 


421773 


W59233 


Hs.119457


ESTs 


l!l2 


1.14 


421948 


L42583 


Ms ^34^ 

rK>>ww*lvv9 


keratin 6A


51.83 


20.25 


421978 




^ 110186 


Nir^-I omlfdn 


1.01 


0.91 


422158 


L10343


Hs.112341


protease inhibitor 3, skin-derived (SKAL


2.37 


1.10 


422440 


MM OAlfil? 


Hs.116724


aldo-keto reductase family 1, member B10


47.53 


32.00 




AUUQCOQAA 
nVT«K)39U0 


ns. iwu 


KAnflnn-nfnniitn ftmuitti fa^nr nin/llftn IMT 

ncpanrrOinuinj} gruwui laciur iflnuuiy ^ 


76.02 


1.00 






Ue 1^77 


hvftnihAfiral nrntpin 1 C)C^7f09 


4.20 


1.00


423738 


AB002134 


Hs.132195 


airway trypsin-like protease


10.14 


51.00 


424012 


AW368377 


Hs.137559


tumor protein 63 kDa with strong homolog


233.42 


66.00 


424046 


AF027865 


Hs.138202


serine (or cysteine) proteinase inhibito


1.00 


1.00 


424098 


AF077374 


Hs.139322 


small proline-rich protein 3


137.82 


54.00 


424834 


AK001432 


Hs.153408 


Homo sapiens cDNA FLJ10570 fis, clone NT


56.19 


1ZO0 


425650 


NM_001944


Hs.1925


desmoglein 3 (pemphigus vulgaris antigen


33.45 


1.00 


427099 


AB032953


Hs.173560


odd Oz/ten-m homolog 2 (Drosophila, mous


4.24 


17.00 


427335 


AA448542


Hs.251677


G antigen 7B


51.83


4.00


428182 


BE386042 


Hs.293317 


ESTs, Weakly similar to GGC1_HUMAN G ANT


1.00 


1.00 


428645 


AA431400 


Hs.98729 


ESTs, Weakly similar to 2017205A dihydro


1.00 


16.00 


428748 


AW593206 


Hs.98785 


Ksp37 protein 


1.00 


87.00 


429259 


AA420450 


Hs^2911 


ESTs, Highly similar to S60712 band-6-pr


2.01


1.18 


429538 


BE182592 


Hs.11261 


small proline-rich protein 2A


4.43


2.90


429903 


AL134197 


Hs.93597 


cyclin-dependent kinase 5, regulatory su


11.80 


1.00 


430486 


BE062109 


Hs^41551 


chloride channel, calcium activated, fam


12.28 


41.00 


430890 


X54232 


Hs.2699 


glypican 1


1.58 


1.40 


431009 


BE149762 


Hs.48956 


gap junction protein, beta 6 (connexin 3


60.25 


28.00 


431846 


BE019924 


Hs.271580


uroplakin 1B


4.49 


2.51 


433091 


Y12642 


Hs.3185 


lymphocyte antigen 6 complex, locus D


1.20 


1.09 


434360 


AW015415 


HS.1277B0 


ESTs 


40.98 


27.00 


4348B0 


U02388 


Hs.101 


cytochrome P450, subfamily IVF, polypept


1.00 


1.00 


435505 


AF200492 


Hs.211238


interleukin-1 homolog 1


1.00 


38.00 


435793 


AB037734 


Hs.4993


KIAA1313 protein


23.68


42.00


436511 


AA721252 


Hs^1502 


ESTs 


16.76 


14.00


438403 


AA806607 


Hs.292206 


ESTs 


100 


1.00 


439285 


AL133916 




hypothetical protein FLJ20093


46.23 


139.00 


439606 


W79123 


Hs.58561 


G protein-coupled receptor 87


33.61 


1.00 


439670 


AF088076


Hs.59507


ESTs, Weakly similar to AC004858 3 U1 sm


1.00 


1.00 


439706 


AW872527 


Hs.59761 


ESTs, Weakly similar to DAP1_HUMAN DEATH


86.55 


11.00 


440325 


NM_003812


Hs.7164


a disintegrin and metalloproteinase doma


62.88 


147.00 


441525 


AW241867 


Hs.127728


ESTs 


1.53 


1.42 


443162 


T49951 


Hs.9029


DKFZP434G032 protein 


31.11 


38.00 


444378 


IM1339 


H8.12S69 


ESTs 


1.00


1.00
446292


AF081497 


Hs.279682 


Rh type C glycoprotein 


447078 


AW885727 


Hs.9914 


ESTs 


447342 


A1199268 


Hs.19322 


Homo sapiens, Similar to RIKEN cDNA 2010


449003 


X76342


Hs.389


alcohol dehydrogenase 7 (class IV), mu o


449101 


AA205847 


Hs.23016 


G protein-coupled receptor


450832 


AW970602 


Hs.105421


ESTs 


452240 


A1591147 


Hs.61232 


ESTs 


453317 


NM_002277 


Hs.41696 


keratin, hair, acidic, 1


453830 


AA5342S6 


Hs.20953 


ESTs 


454098 


WZ7953 


Hs^2911 


ESTs, Highly similar to S60712 band-6-pr


455601 


A1368680 


Hs.816 


SRY (sex determining region Y)-box 2



1.55 


1.26 


47.24 


24.00 


28.63 


1.00 


1.00 


1.00


2^ 


27.00 


25.17 


36.00 


13.42 


1.00 


1.19 


1.27 


24.92 


25.00 


1.26 


1.11 


206.11 


1.00
TABLE 12B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession

43^ 47085 1 AL133916N79113AF086101 1^6721 AW95082BAA364013AVW566B4AI346341AI867454 N547B4AI655Z70AI421279AVV014^ 

AA7755S2 N62351 NS9253 AA626243Ai341407 BE17S639 AA456968 AI35691B AA457077 



TABLE 12C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

400668 8118496 Plus 17982-18115,20297-20456 

401780 7249190 Minus 28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573

401781 724910) Wnua 8321&«435.83531-63656,8374{M3901.B4237-84393,8495M5037,88290^814 

401785 7249190 Minus 165776-165996,166189-166314,166408-166669,167112-167268,167387-167469,168634-168942

401994 4153858 Minus 4290443124,4321143336,44807-44763.4519^5281.46337-48732 

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076

404996 6007890 Plus 37999-38145,38652-38998,39727-39872,40557-40874,42351-42450
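
Each row of Table 12C carries a Pkey, a Genbank GI reference, a strand, and a comma-separated list of exon ranges. As a hedged illustration of how such a row could be read programmatically, the sketch below splits one row into its fields and converts the Nt_position entry into numeric ranges. The function name `parse_nt_position` is an assumption made for this example, and the row layout (four whitespace-separated fields) is inferred from the table header.

```python
def parse_nt_position(field):
    """Split an Nt_position string such as '17982-18115,20297-20456' into (start, end) pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start {start} exceeds end {end}")
        exons.append((start, end))
    return exons


# Row for Pkey 400668 as printed in Table 12C.
row = "400668 8118496 Plus 17982-18115,20297-20456"
pkey, ref, strand, nt_position = row.split()

exons = parse_nt_position(nt_position)
# Total predicted exon length, counting both endpoints of each range.
total_length = sum(end - start + 1 for start, end in exons)
print(pkey, ref, strand, exons, total_length)
```

Raising an error when a start exceeds its end is a design choice for this sketch; it flags rows whose coordinates were damaged in transcription rather than silently accepting them.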
TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal Lung

Table 13A shows about 23 genes upregulated in non-malignant lung disease relative to lung tumors and normal lung. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 13B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 13A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.

Table 13C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 13A. For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples



Pkey 


ExAccn 


UnigenelD 


Unigene Title


R1 


R2 


408562 


AI436323 


Hs.31141 


Homo sapiens mRNA for KIAA1568 protein.


1.00 


230.00 


409031 


AA376836 


HsJ6728 


ESTs 


1.00 


i2aoo 


412372 


R65998 


Hs^5243 


hypothetical protein FLJ22029


1.00 


i7aoo 


415910 


U20350


Hs.78913


chemokine (C-X3-C) receptor 1


1.00 


145.00 


417511 


AL049176 


Hs.82223 


chordin-like 


1.00 


179.00 


418819 


AA228776 


Hs.191721 


ESTs 


1.00 


140.00 


422060 


R20893 


Hs^823 


ESTs, Moderately similar to ALU5_HUMAN A


1.00 


156.00 


4245B5 


AA464840 


Hs.131987 


ESTs 


1.00 


167.00 


426753 


T89832 


Hs.170278 


ESTs 


1.00 


141.00 


429498 


AA453800 


Hs.192793


ESTs 


1.00 


138.00 


430719 


AA488988 


Hs^98 


ESTs 


1.00 


moo 


431089 


BE041395 




ESTs. Weakly similar to unknown protein 


23.32 


941.00 


431385 


BE178536 


Hs.11090 


membrane-spanning 4-domains, subfamily A


1.00 


157.00 


431726 


NM_007351


Hs.268107


multimerin


1.00 


157.00 


436532 


AA721S22 




gbaw54h12j1 NCLCGAP^ Honnsaptens 


1.00 


2iaoo 


437960 


AI669586 


Hs.222194 


ESTs 


1.00 


147.00 


438202 


AW169287 


Hs.22588 


ESTs 


1.00 


14100 


441499 


AW298235 


Hs.101689 


ESTs


1.00 


167.00 


444513 


AL120214 


Hs.7117 


glutamate receptor, ionotropic, AMPA 1


1.00 


15100 


448253 


H25899 


Hs.201591


ESTs 


1.00 


14100 


453635 


R67837 


Hs.169872


ESTs 


1.00 


iiaoo 


458332 


AI000341 


Hs.220491


ESTs 


1.00 


19Z00 


459587 


AA03ig56 




gb:zk15e04.s1 Soares_pregnant_uterus_NbH


1.00 


154.00 



TABLE 13B 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession

431089 327825J BE041395AA4gi826AA621946AA715960AA66G1Q2 
436532 42180^1 AA721S22AW975443T93070 



TABLE 13C

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
TABLE 14A: Preferred Utility and Subcellular Localization for Potential Lung Disease Targets

Table 14A shows the subcellular localization and preferred utility for the genes appearing in Tables 9A and 10A. mAb symbolizes monoclonal antibody, diag symbolizes diagnostic, s.m. symbolizes small molecule, and CTL symbolizes cytotoxic lymphocyte ligand. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 14B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 14A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" column.

Table 14C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 14A. For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Pref. Utility: Preferred Utility

Pred. Loc: Predicted subcellular localization
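
The Pref. Utility column combines the abbreviations defined in the legend with an ampersand, for example "mAb & diag & s.m.". The following is a minimal sketch of expanding such an entry into its full terms; the dictionary `UTILITY_CODES` and the function `expand_utility` are hypothetical names introduced only for this illustration, and the handling of spacing around the ampersand is an assumption about how the entries are written.

```python
# Abbreviations as defined in the Table 14A legend.
UTILITY_CODES = {
    "mAb": "monoclonal antibody",
    "diag": "diagnostic",
    "s.m.": "small molecule",
    "CTL": "cytotoxic lymphocyte ligand",
}


def expand_utility(field):
    """Expand a Pref. Utility entry such as 'mAb & diag & s.m.' into full terms."""
    codes = [code.strip() for code in field.split("&")]
    # A KeyError here signals an abbreviation that the legend does not define.
    return [UTILITY_CODES[code] for code in codes]


print(expand_utility("mAb & diag & s.m."))
print(expand_utility("CTL&diag"))
```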



Pkey 


ExAccn 


UnigeneID


Unigene Title


Pref. Utility


Pred. Loc


400289 


X07820 


Hs.2258 


matrix metalloproteinase 10 (stromelysin


mAb & diag & s.m.


extracellular


400303 


AA242758 


Hs.79136 


UV-I protein, estrogen regulated 


mAb 


plasma membrane 


402075






ENSP00000251056*'Plasma membrane calcium 


mAb & diag


secreted 


407811 


AW190902


Hs.40098


cysteine knot superfamily 1, BMP antagon


diag 


secreted 


408243 


Y00787 


Hs.624 


interleukin 8


diag 


secreted 


408790 


AW580227 


Hs.47860


neurotrophic tyrosine kinase, receptor, 


mAb&s.m. 


plasma membrane 


408908 


BE296227 


Hs^50822 


serine/threonine kinase 15


s.m.


cytoplasm 


405041 


AB033025


Hs.50081


Hypothetical protein, XP_051860 (KIAA119


CTL&diag 


secreted 


409103 


AF251237 


Hs.112208


XAGE-1 protein


ca 


nuclear


409420 


Z15008 


Hs.54451 


laminin, gamma 2 (nicein (100kD), kalini


diag 


secreted 


409632 


W74001 


Hs.55279


serine (or cysteine) proteinase inhibito


diag


secreted 


409757 


NM_001898


Hs.123114


cystatin SN


diag 


extracellular 


409893


AW247090


Hs.57101


minichromosome maintenance deficient (S.


ca 


nuclear 


409956 


AW103384 


Hs.727 


inhibin, beta A (activin A, activin AB a


diag 


extracellular 


410001 


AB041036 


Hs.57771 


kallikrein 11


diag


extracellular


410407 


X66839


Hs.63287


carbonic anhydrase IX 


mAb&sm 


plasma membrane 


410418 


D31382 


Hs.63325 


transmembrane protease, serine 4 


mAb & diag & s.m.


plasma membrane


412140 


AA219891 


Hs.73625 


RAB6 interacting, kinesin-like (rabkines


s.m. 




412719 


AW01661D 


Hs.816 


ESTs 


s.m.


nuclear


414774 


X02419


Hs.77274


plasminogen activator, urokinase


diag


extracellular


414683 


AA926960 




CDC28 protein kinase 1 


s.m.




415138 


C18356 


Hs.295944 


tissue factor pathway inhibitor 2 


CTL&diag 


extracellular 


415669 


NhL005025 


H5.76S89 


serine (or cysteine) proteinase inhibito


mAb & diag &s.m. 


secreted 


415817


U88967


Hs.78867


protein tyrosine phosphatase, receptor t


mAb&s.m.


plasma membrane


416658


U03272


Hs.79432


fibrillin 2 (congenital contractural ara


diag


extracellular


417034 


NM_006183 


Hs.80952


neurotensin


diag


extracellular 


417079 


U65590 


Hs.81134


interleukin 1 receptor antagonist


diag


extracellular


417308 


H60720 


Hs.81892 


KIAA0101 gene product 


s.m. 


mitochondrial


417389 


BE2G0964 


Hs.82045


midkine (neurite growth-promoting factor


mAb&diag


secreted 


417433 


BE2702e6 


Hs.82128 


5T4 oncofetal trophoblast glycoprotein


mAb 


plasma membrane 


417933 


X02308 


Hs.82962


thymidylate synthetase


s.m.


endoplasmic reticulum


418478 


U88945 


Hs.1174 


cyclin-dependent kinase inhibitor 2A (me


s.m. 


cytoplasm 


418508 


AA084248 


Hs.85339


G protein-coupled receptor 39


mAb&s.m.


plasma membrane


418678 


NM_001327


Hs.167379


cancer/testis antigen (NY-ESO-1)


CTL


cytoplasmic


419121 


AA374372 


Hs.89626


parathyroid hormone-like hormone


diag 


secreted 


419171 


NM_002846 


HS.8985S 


protein tyrosine phosphatase, receptor t 


mAb&s.m. 


plasma membrane 


419183 


U606&9 


Hs.89563 


cytochrome P450, subfamily XXIV (vitamin 


CTL&s.m.


mitochondrial 


419216 


AU076718 


Hs.164021 


small inducible cytokine subfamily B (Cy


diag 


secreted 


419235 


AW470411 


Hs.288433 


neurotrimin 


mAb&diag


plasma membrane


419452 


U33635 


Hs.90572 


PTK7 protein tyrosine kinase 7


mAb&s.m. 


plasma membrane 


419556 


U29615 


Hs.91093


chitinase 1 (chitotriosidase)


mAb&diag


extracellular


420610 


AI683183 


Hs.99346 


distal-less homeo box 5


CTL 


nuclear 


421110 


AJ250717 


Hs.1355 


cathepsin E


s.m.&diag


extracellular 


421379 


Y15221 


Hs.103982 


small inducible cytokine subfamily B (Cy


diag


secreted 


421474 


U76382 


Hs.104637 


solute carrier family 1 (glutamate trans


mAb&s.m.


plasma membrane


421552 


AF026692


Hs.105700


secreted frizzled-related protein 4


diag


secreted 


421753 


BE314828 


Hs.107911 


ATP-binding cassette, sub-family B (MDR


mAb&s.m. 


plasma membrane 


421817 


AF146074 


Hs.108660 


ATP-binding cassette, sub-family C (CFTR


mAb&s.m. 


plasma membrane 


422109 


S73265 


Hs.1473 


gastrin-releasing peptide


diag


secreted 


422158 


L10343 


Hs.112341 


protease inhibitor 3, skin-derived (SKAL


diag 


secreted 


422282 


AF019225 


Hs.114309


apolipoprotein L 


diag 


secreted 


422283 


AW411307 


Hs.114311 


CDC45 (cell division cycle 45, S.cerevis


s.m.


nuclear


422424 


AI186431 


Hs.296638 


prostate differentiation factor


diag


extracellular 


422765 


AW409701 


Hs.1578 


baculoviral IAP repeat-containing 5 (sur


s.m.


cytoplasm


422809 


AK001379 


Hs.121028 


hypothetical protein FLJ10549


s.m.


nuclear


422867 


L32137 


i4s.1564 


cartilage oligomeric matrix protein (pse


diag


extracellular


422956 


BE545072 


Hs.122579 


ECT2 protein (Epithelial cell transformi


CTL&s.m.




423634 


AW959908 


Hs.1690


heparin-binding growth factor binding pr 


diag 




423573 


BE003054 


Hs.1695


matrix metalloproteinase 12 (macrophage


mAb & diag & s.m.


secreted 


423961 


013686 


Hs.136348 


periostin (OSF-2os)


mAb&diag


extracellular 


424046 


AF027866 


Hs.138202


serine (or cysteine) proteinase inhibito


diag


secreted


424381 


AA285249 


Hs.146329 


protein kinase Chk2


s.m.


nuclear
424502


AF242386 


Hs.149585


424503 


NM_002205 


Hs.149609


424687 


J05070 


Hs.151738 


425247 


NM_005940


Hs.155324


425322 


U63830 


Hs.155637 


425650 


NM_001944 


Hs.1925 


425734 


AP055209 


Hs.159396 


425776 


U25128 


Hs.159499 


425B52 


AK001504 


Hs.159651 


426215 


AWS63419 


tte,155223 


426427 


M86699 


Hs.169840


426514 


BE616633 


Hs.170195


427335 


AA448542 


Hs.251677 


427747 


AW411425 


Hs.180655 


428242 


K55709 


Hs^SO 


428330 


L22524 


Hs.2256 


428450 


NM_014791


Hs.184339 


428479 


Y00272 


Hs.334562 


428484 


AF104032 


Hs.184601 


426664 


AK001666 


Hs.18g095 


428698 


AAe52773 


Hs.334838


426748 


AWSS3208 


Hs.98785 


428758 


AA433988 


Hs.98502


428869 


AF120274 


Hs.194689 


429211 


AF052693 


HS.19B249 


429263 


AA019004 


Hs.19e396 


429547 


AWQ09166 


Hs.89376


429610 


AB024937 


Hs.211092 


429903 


AL134197 


Hs.93597


430486


BE062109


Hs^41551 


431462 


AW563672 


Hs;!56311 


431515 


Nlit012152 


Hsi585B3 


431846 


BE019924 


Hs^1580 


4319^ 


X63629 


Hs^77 


432201 


AI538613 


Hs^8241 


433001 


AF217513 


Hs.279905


435505


AF200492


Hs.211238 


4364aV 


AA379597 


Hs.5ig9 


437016 


AU076916 


Hs.53g8 


437044 


ALJ035864 


HS.69S17 


437789 


AI561344 


Hs.127812 


437852 


BE001836 


Hs.256897 


439223 


AW23e299 


Hs.2Sa618 


439477 


W69813 


Hs.58042 


439606 


W79123 


Hs^561 


439738 


BE246502 


Hs.95g8 


440006 


AKDO0S17 


Hs.6844 


441362 


BE614410 


Hs.23044 


442117 


AW864S64 


Hs.128899 


443247 


BE614387 


Hs.333693


443426 


AF098156 


Hs.9329 


443859 


NM_013409


Hs.9914 


444006 


BE395085 


HS.1Q066 


444371 


BE540274 


Hs.239 


444381 


BE387335 


Hs.283713 


444761 


NM_014400


Hs.11950


445537 


AJ245671 


Hs.12844 


446619 


AU076643 


Hs.313 


446921 


AB012113 


Hs.16530 


447033 


AI3S7412 


Hs.157601 


447342 


A1199268 


Hs.19322 


448243 


AW369771 


Hs.52620


448844 


AI581519 


Hs.177184 


449048 


Z45051 


Hs.22920 


449722 


BE280a74 


Hs.23960 


450001 


NM_001044


Hs.406 


450975 


AA009647 




450701 


H39960 


Hs.288467 


450983 


AA30S384 


Hs.25740 


451668 


Z43948 


Hs.326444


452281 


T93500 


Hs.28792 


452401 


NM_007115


Hs^S2 


452747 


6E1S3855 


Hs.61460 


452838 


1)65011 


Hs.30743


453968 


AAa47843 


Hs.62711 


457469 


Ai693815 


Hs.127179 



integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 9 (gelatinase B
matrix metalloproteinase 11 (stromelysin
protein kinase, DNA-activated, catalytic
desmoglein 3 (pemphigus vulgaris antigen
peptidylglycine alpha-amidating monooxyg
parathyroid hormone receptor 2
death receptor 6, TNF superfamily member
stanniocalcin 2
TTK protein kinase

bone morphogenetic protein 7 (osteogenic
G antigen 7B

serine/threonine kinase 12

leukemia inhibitory factor (cholinergic

matrix metalloproteinase 7 (matrilysin,

KIAA0175 gene product

cell division cycle 2, G1 to S and G2 to

solute carrier family 7 (cationic amino

similar to SALL1 (sal (Drosophila)-like

KIAA1866 protein

Ksp37 protein

CA125 antigen; mucin 16

artemin

gap junction protein, beta 5 (connexin 3
ATP-binding cassette, sub-family A (ABC1
ESTs

LUNX protein; PLUNC (palate lung and nas
cyclin-dependent kinase 5, regulatory su
chloride channel, calcium activated, fam
granin-like neuroendocrine peptide precu
endothelial differentiation, lysophospha
uroplakin 1B

cadherin 3, type 1, P-cadherin (placenta
Transmembrane protease, serine 3
clone HQ0310 PRO0310p1
interleukin-1 homolog 1
HSPC150 protein similar to ubiquitin-con
guanine monophosphate synthetase
differentially expressed in Fanconi's an
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to dJ365O12.1 [H
UL16 binding protein 2

ESTs, Moderately similar to GFR3_HUMAN G
G protein-coupled receptor 87
sema domain, immunoglobulin domain (Ig),
NALP2 protein; PYRIN-containing APAF1-l
RAD51 (S. cerevisiae) homolog (E coli Re
ESTs; hypothetical protein for IMAGE:447
c-Myc target JPO1

chromosome 20 open reading frame 1



type I transmembrane protein Fn14
forkhead box M1

ESTs, Weakly similar to S64054 hypotheti
GPI-anchored metastasis-associated prote
EGF-like-domain, multiple 6
secreted phosphoprotein 1 (osteopontin,
small inducible cytokine subfamily A (Cy
ESTs

Homo sapiens, Similar to RIKEN cDNA 2010

integrin, beta 8

ESTs

similar to S68401 (cattle) glucose induc
cyclin B1

solute carrier family 6 (neurotransmitte
a disintegrin and metalloproteinase doma
hypothetical protein XP_098151 (leucine-
ERO1 (S. cerevisiae)-like
cartilage acidic protein 1
Homo sapiens cDNA FLJ11041 fis, clone a
tumor necrosis factor, alpha-induced pro
Ig superfamily receptor LNIR
preferentially expressed antigen in mela
High mobility group (nonhistone chromoso
cryptic gene



s.m. 


cytoplasmic 


mAb&s.m. 


plasma membrane 


diag


extracellular 


mAb&diag&sm 


secreted 


s.m. 


cytoplasmic 


mAb 


plasma membrane


s.m.




mAb&diag 


plasma membrane 


mAb&s.m. 


plasma membrane


mAb&diag 


secreted 


CTL&s.m. 


nuclear


mAb&diag 


secreted 


CTL 


cytoplasmic 


s.m. 


cytoplasmic 


diag

mAb & diag & s.m.


extracellular


s.m. 


nuclear 


sjn. 


nuclear 


mAb&SJn. 


plasma membrane


CTL&s.m.


nuclear


mAb 




diag 


extracellular 


dlag 


mHochodria* 


diag 


extracellular


mAb&sm


plasma membrane


mAb&sm 


plasma membrane 


diag 




mAb&diag 


secreted






mAb&sm 


plasma membrane


diag


extracellular


mAb&sm


plasma membrane


mAb&diag 


plasma membrane 


mAb&diag 


plasma membrane 


mAb&diag&sjn. 


plasma membrane 
nudear 


S.RL 

diag 


secreted 


sm. 

sm, 


cytoplasm 


CTL 


ER 


CTL 


nudear 


mAb&sm 


plasma membrane 


mAb 


plasma memtirane 


mAb&sm 




mAb&sm 


pla^na membrane 


mAb&sm 


plasma membrane 


am 


nudear 


sm 

mAb&sm 


{toma merrriirane 


m 

\j\ L 


oAaceDidar* 


CTL 




dlag 


BKlfBOflllUlflf 


mAb 


ptesma memtvane 


sm 


nudear 


dlag 


secreted 


mAb&diag 


plasma Riemtirane 


mAb&diag 




diag 


secreted 


diag 

CTL & diag 




CTL 




hiAK Asm — 


plasma mennrane 


mAb&sm 




mAb 


ptasrTQ memt)rane 


sm 


cytoplasm 


mAb&sm. 


plasma membrane 


mAb&diag&sm 


{^asma mennrane 


mAb&diag 


plasina mernbrane 


diag 

mAb&diag 


|dasma men^Hane 


diag 
diag 


odraodlular 


mAb 


plaana memtvane 


CTL 


nudear 


CTL&s.m. 


nudear 



TABLE 14B 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT Number Accession
414883 15024 \ AA92S960AA9269S9VV76521 W24270VV21526AA037172BE267636 H83166AA46g909 N863g6AAO^ 

AA082436 H72S25 H77575 N4g786 W80S65 H78746 BE569085 W04339 R981 27 758938 BE279271 AW9G0304 T29812 AA476873 BE297387 
AA292753 AA177048 NM.001826 X54941 6E314366 AAg08783 A)7ig075 BE270172 BE26g819 AA889955 AI204630 W25243 At935150 
AAB72039 W72395 T99630 AI422691 H9B460 N31428 BE255916 H03265 A1857576 AA776920 AA91 0844 AA459522 AA293140 AW514667 
5 R75953 AW6B2396 AA6e2522 AJ865147 AI423153 AW262230 AA584410 AA583187 AW024595 AW0e9734 AI828996 AA282997 AAB76046 

AW613002AA527373AVV97245gAI831360AA621337AA100926AA772418AA594628Al033B92W95098AI0343^ 
N95210AI459432AI041437AAS32124AA6276B4AA93582SAI004827AI423513 AI094S97H42079R54703M^^ AA97B045 
AA643280W44561 AI991988AI537692 A)0gQ262AA740817AI312104 AI911822 AA416871 AI1B5409 AA12g784AA701623Al075239 
AI139549 AA633648 AI339996 AI336880 AA399239 AI078708 AI085351 A)382835 AI34661 8 AI146955 At989380 At348243 N92B92 AA765850 

1 0 AI494230 AJ27B887 AA962596 AI492600 W80435 AA001979 R97424 AI129015 N24127 AA157451 AA23554g AA459292 AA037114 AA12g785 

AM94211 AW053801 AW886710 R92790 NS9755 AI36112B AWSBg407 H4772S H97534 H48076 H46450 T99631 AW300758 H03431 R76789 
AA954344 H77576 R96823 AI457100 N92B45 N49682 H42038 BE22069B BE220715 HQSSSZ AA701624 N74173 RS4704 H79520 H72923 
H032666E261919AA769633AA480310 AA507454AA910Se6Ai203723AW104725W25611 W2S071 188980 H0351 3 777589 R991 56 
Wg5095 R97470AA702275T77551 AA911952 H82956 N83673AA283672 

15 450375 83327 1 AA009647AA131254AA374293AWg54405H04410AW608284AA151166BE157467BE157601 H04384 W48291 AW^ 

AAig0993 K03231 H59605 H01642 AAB52876 AA1 13758 AA626915 AA7469S2 AI161014 AA099554 R69067 

TABLE 14C 

Pkey: Unique number corresponding to an Eos probeset 
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 

Pkey  Ref  Strand  Nt_position 

402075 8117407 Plus 121907-122036.122804-122921.12«)ig-1»181.124455-12«10,12SB72-126076 
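
The Nt_position column lists each predicted exon as a start-end pair, with pairs separated by commas. As a minimal sketch of how such a field could be read programmatically, the Python below parses a field into exon intervals and sums their lengths. The function names are hypothetical, the two exon ranges are taken from the row for Pkey 402075, and the treatment of coordinates as 1-based and inclusive is an assumption that the table legend does not state.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Parse an Nt_position field such as '122804-122921,124455-124610'."""
    exons = []
    for chunk in field.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing comma
        start_text, end_text = chunk.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start {start} exceeds end {end}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    # Assumption: coordinates are 1-based and inclusive, so an exon
    # spanning start..end covers end - start + 1 nucleotides.
    return sum(end - start + 1 for start, end in exons)


exons = parse_nt_position("122804-122921,124455-124610")
print(exons)
print(total_exon_length(exons))
```

If the coordinates were instead half-open, the `+ 1` in `total_exon_length` would need to be dropped.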
TABLE 15A: Information for all sequences in Table 16 

Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 16. 

Table 15B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 15A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 15C shows the genomic positioning for those Pkeys lacking Unigene IDs and accession numbers in Table 15A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Seq ID No: Sequence ID number 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 



Seq ID No:            Pkey    ExAccn     UnigeneID  Unigene Title 

Seq ID No: 1 & 2      410407  X66839     Hs.63287   carbonic anhydrase IX 
Seq ID No: 3 & 4      412719  AW016610   Hs.616     ESTs 
Seq ID No: 5 & 6      417034  NM_006183  Hs.80962   neurotensin 
Seq ID No: 7 & 8      430486  BE062109   Hs.241551  chloride channel, calcium activated, fam 
Seq ID No: 9 & 10     407788  BE514982   Hs.38991   S100 calcium-binding protein A2 
Seq ID No: 11 & 12    407788  BE514982   Hs.38991   S100 calcium-binding protein A2 
Seq ID No: 13 & 14    407788  BE514982   Hs.38991   S100 calcium-binding protein A2 
Seq ID No: 15 & 16    407788  BE514982   Hs.38991   S100 calcium-binding protein A2 
Seq ID No: 17 & 18    439285  AL133916              hypothetical protein FLJ20093 
Seq ID No: 19 & 20    413753  U17760     Hs.75517   laminin, beta 3 (nicein (125kD), kalinin 
Seq ID No: 21 & 22    120486  AW368377   Hs.137669  tumor protein 63 kDa with strong homolog 
Seq ID No: 23 & 24    425650  NM_001944  Hs.1925    desmoglein 3 (pemphigus vulgaris antigen 
Seq ID No: 25 & 26    412140  AA219891   Hs.73625   RAB6 interacting, kinesin-like (rabkines 
Seq ID No: 27 & 28    423673  BE003054   Hs.1695    matrix metalloproteinase 12 (macrophage 
Seq ID No: 29 & 30    452838  U65011     Hs.30743   preferentially expressed antigen in mela 
Seq ID No: 31 & 32    418663  AK001100   Hs.41690   desmocollin 3 
Seq ID No: 33 & 34    418663  AK001100   Hs.41690   desmocollin 3 
Seq ID No: 35 & 36    409632  W74001     Hs.55279   serine (or cysteine) proteinase inhibito 
Seq ID No: 37 & 38    429610  AB024937   Hs.211092  LUNX protein; PLUNC (palate lung and nas 
Seq ID No: 39 & 40    406630  M29540     Hs.220529  carcinoembryonic antigen-related cell ad 
Seq ID No: 41 & 42    431846  BE019924   Hs.271580  uroplakin 1B 
Seq ID No: 43 & 44    418830  BE513731   Hs.88959   hypothetical protein MGC4616 
Seq ID No: 45 & 46    424098  AF077374   Hs.139322  small proline-rich protein 3 
Seq ID No: 47 & 48    443648  AI085377   Hs.143610  ESTs 
Seq ID No: 49         311034  BE567130   Hs.311389  ESTs, Highly similar to NKGD_HUMAN NKG2- 
Seq ID No: 50 & 51    408522  AI541214   Hs.46320   Small proline-rich protein SPRK [human, 
Seq ID No: 52 & 53    422158  L10343     Hs.112341  protease inhibitor 3, skin-derived (SKAL 
Seq ID No: 54 & 55    435505  AF200492   Hs.211238  interleukin-1 homolog 1 
Seq ID No: 56 & 57    417366  BE185289   Hs.1076    small proline-rich protein 1B (cornifin) 
Seq ID No: 58 & 59    431958  X63629     Hs.2877    cadherin 3, type 1, P-cadherin (placenta 
Seq ID No: 60 & 61    441020  W79283     Hs.35962   ESTs 
Seq ID No: 62 & 63    423217  NM_000094  Hs.1640    collagen, type VII, alpha 1 (epidermolys 
Seq ID No: 64 & 65    429538             Hs.11261   small proline-rich protein 2A 
Seq ID No: 66 & 67    448733  NM_005629  Hs.167958  solute carrier family 6 (neurotransmitte 
Seq ID No: 68 & 69    444371  BE540274   Hs.239     forkhead box M1 
Seq ID No: 70 & 71    444371  BE540274   Hs.239     forkhead box M1 
Seq ID No: 72 & 73    444371  BE540274   Hs.239     forkhead box M1 
Seq ID No: 74 & 75    422168  AA586894   Hs.112408  S100 calcium-binding protein A7 (psorias 
Seq ID No: 76 & 77    422166  AA586894   Hs.112408  S100 calcium-binding protein A7 (psorias 
Seq ID No: 78 & 79    429259  AA420450   Hs^2911    plakophilin 
Seq ID No: 80 & 81    426440  BE382756   Hs.169902  solute carrier family 2 (facilitated glu 
Seq ID No: 82 & 83    437044  AL035864   Hs.69517   differentially expressed in Fanconi's an 
Seq ID No: 84 & 85    423662  AK001035   Hs.130881  B-cell CLL/lymphoma 11A (zinc finger pro 
Seq ID No: 86 & 87    428484  AF104032   Hs.184601  solute carrier family 7 (cationic amino 
Seq ID No: 88 & 89    429211  AF052693   Hs.198249  gap junction protein, beta 5 (connexin 3 
Seq ID No: 90 & 91    417389  BE260964   Hs.82045   midkine (neurite growth-promoting factor 
Seq ID No: 92 & 93    423534  AW959906   Hs.1690    heparin-binding growth factor binding pr 
Seq ID No: 94 & 95    417515  L24203     Hs.82237   ataxia-telangiectasia group D-associated 
Seq ID No: 96 & 97    441362  BE614410   Hs.23044   RAD51 (S. cerevisiae) homolog (E coli Re 
Seq ID No: 98 & 99    425322  U63630     Hs.155637  protein kinase, DNA-activated, catalytic 
Seq ID No: 100 & 101  449003  X76342     Hs.389     alcohol dehydrogenase 7 (class IV), mu o 
Seq ID No: 102 & 103  431009  BE149762   Hs.48956   gap junction protein, beta 6 (connexin 3 
Seq ID No: 104 & 105  409103  AF251237   Hs.112208  XAGE-1 protein 
Seq ID No: 106 & 107  417542  J04129     Hs.82269   progestagen-associated endometrial prote 
Seq ID No: 108 & 109  428471  X57348     Hs.184510  stratifin 
Seq ID No: 110 & 111  418004  U37519     Hs.87539   aldehyde dehydrogenase 3 family, member 
Seq ID No: 112 & 113  414761  AU077228   Hs.77256   enhancer of zeste (Drosophila) homolog 2 
Seq ID No: 114 & 115  418203  X54942     Hs.83758   CDC28 protein kinase 2 
Seq ID No: 116        447343  AA256641   Hs.23^     ESTs, Highly similar to S02392 alpha-2- 
Seq ID No: 117 & 118  437016  AU076916   Hs.5398    guanine monophosphate synthetase 
Seq ID No: 119 & 120  449230  BE613348   Hs.211579  melanoma cell adhesion molecule 
Seq ID No: 121 & 122  446989  AKOOISOS   Hs.16740   hypothetical protein FLJ11036 
Seq ID No: 123 & 124  457819  AA057484   Hs.35408   ESTs, Highly similar to unnamed protein 
Seq ID No: 125 & 126  424687  J05070     Hs.151738  matrix metalloproteinase 9 (gelatinase B 
Seq ID No: 127 & 128  414430 
Seq ID No: 129 & 130  418462 
Seq ID No: 131 & 132  100668 
Seq ID No: 133 & 134  458933 
Seq ID No: 135 & 136  418478 
Seq ID No: 137 & 138  418478 
Seq ID No: 139 & 140  418478 
Seq ID No: 141 & 142  418478 
Seq ID No: 143 & 144  446269 
Seq ID No: 145 & 146  422765 
Seq ID No: 147 & 148  436481 
Seq ID No: 149 & 150  440325 
Seq ID No: 151 & 152  439606 
Seq ID No: 153 & 154  453884 
Seq ID No: 155 & 156  453884 
Seq ID No: 157 & 158  453884 
Seq ID No: 159 & 160  453884 
Seq ID No: 161 & 162  404877 
Seq ID No: 163 & 164  413129 
Seq ID No: 165 & 166  413281 
Seq ID No: 167 & 168  444781 
Seq ID No: 169 & 170  416819 
Seq ID No: 171 & 172  451320 
Seq ID No: 173 & 174  418543 
Seq ID No: 175 & 176  454034 
Seq ID No: 177 & 178  425397 
Seq ID No: 179 & 180  415817 
Seq ID No: 181 & 182  415817 
Seq ID No: 183 & 184  415817 
Seq ID No: 185 & 186  415817 
Seq ID No: 187 & 188  415817 
Seq ID No: 189 & 190  419121 
Seq ID No: 191 & 192  448893 
Seq ID No: 193 & 194  421817 
Seq ID No: 195 & 196  430393 
Seq ID No: 197 & 198  425067 
Seq ID No: 199 & 200  420482 
Seq ID No: 201 & 202  102963 
Seq ID No: 203 & 204  100576 
Seq ID No: 205 & 206  101175 
Seq ID No: 207 & 208  429038 
Seq ID No: 209 & 210  418678 
Seq ID No: 211 & 212  418678 
Seq ID No: 213 & 214  131927 
Seq ID No: 215 & 216  428182 
Seq ID No: 217 & 218  427335 
Seq ID No: 219 & 220  409420 
Seq ID No: 221 & 222  114346 
Seq ID No: 223 & 224  438956 
Seq ID No: 225 & 226  404440 
Seq ID No: 227 & 228  415669 
Seq ID No: 229 & 230  103312 
Seq ID No: 231 & 232  320843 
Seq ID No: 233        429065 
Seq ID No: 234 & 235  446102 
Seq ID No: 236 & 237  330495 
Seq ID No: 238        413573 
Seq ID No: 239 & 240  428479 
Seq ID No: 241 & 242  428479 
Seq ID No: 243 & 244  332180 
Seq ID No: 245        437915 
Seq ID No: 246 & 247  441553 
Seq ID No: 248 & 249  331692 
Seq ID No: 250 & 251  429413 
Seq ID No: 252 & 253  422283 
Seq ID No: 254 & 255  448357 
Seq ID No: 256 & 257  446292 
Seq ID No: 258 & 259  416209 
Seq ID No: 260 & 261  453922 
Seq ID No: 262 & 263  424046 
Seq ID No: 264 & 265  439223 
Seq ID No: 266 & 267  429228 
Seq ID No: 268 & 269  409757 
Seq ID No: 270 & 271  411089 
Seq ID No: 272 & 273  436511 
Seq ID No: 274 & 275  428969 
Seq ID No: 276 & 277  428969 
Seq ID No: 278 & 279  428969 
Seq ID No: 280 & 281  428969 
Seq ID No: 282        407137 
Seq ID No: 283 & 284  412723 
Seq ID No: 285 & 286  450701 
Seq ID No: 287 & 288  405770 
Seq ID No: 289 & 290  439453 
Seq ID No: 291 & 292  414774 



AI346201 


Hs.76118 


BE001596 


Hs.85266 


L05424 


Hs.169610 


A1638429 


HS54763 


U38945 


Hs.1174 


U38945 


Hs.1174 


U38945 


Hs.1174 


U38945 


Hs.1174 


AW263155 


Hs.14559 


AW409701 


Hs.1578 


AA379597 


Hs^199 


NMl.003612 


Hs.7164 


W79123 


K5.5&561 


AA355925 


Hs.36232 


AA355925 


Hs.36232 


AA355925 


Hs.36232 


AA35592S 


Hs.36232 


AF292100 


Hs.104613 


AA861271 


Hs.222024 


N^L014400 


Hs.11950 


U77735 


Ks.80205 


AW118072 




NI^00S329 


Hs.85962 


NM 000691 


Hs.575 


J04088 


Hs.156346 


U8B967 


Hs.78867 


U86967 


Hs78867 


U88967 


Hs.78867 


068967 


Hs.78867 


U88967 


Hs.7a867 


AA374372 


Hs.89826 


A(471630 


H8.8127 


AF146074 


Hs.108660 


BE185030 


H8.241305 


AAB26434 


Hs.1619 


AF050147 


Hs.07932 


XD2404 


- Hs.274534 


X00356 


HS.370S8 


U82571 


Hs.36980 


AL023513 


H8.194768 


NiyL001327 


H8.167379 


NiyL001327 


Hs.167379 


AJ003112 


Hs.34780 


BE386042 


H8.293317 


AA448542 


Hs.251677 


Z15008 


Hs^l 


AL137256 


H&130489 


W00847 


HS.13S0S6 


NML00a)25 


H5.78589 


Y12642 


Hs.3185 


BE06%88 


Hs.34744 


Af753247 


HS.29&43 


AW166087 


Hs.317694 


U47924 


H5.71642 


AI733869 


Hs.148089 


Y00Z72 


HS.3345G2 


Y00272 


Hs.334562 


AF134160 


Hs.7327 


AI637993 


Hs.202312 


AA281219 


H8.121298 


AI683487 


HS.1S2213 


NM-0140S8 


Hs.201877 


AW411307 


Hs.114311 


N20169 


Hs.108923 


AF081497 


H8i79682 


AA238776 


Hs.79078 


AR)53306 


Hs.36708 


AF027866 


Hs.138202 


AW236299 


Hs.250618 


A1S53633 


Hs.326447 


NM.001898 


Hs.123114 


AA456454 


Hs214291 


AA721252 


Hs.291502 


AF120274 


Hs.194689 


AF120274 


Hs.194689 


AF120274 


K5.194689 


AF120274 


Hs.194689 


T97307 




AA6484S9 


HS.33S951 


H39960 


H8.2B8467 


BE264974 


lie eccc 


X02419 


H8.77274 



ubiquitin carboxyl-terminal esterase L1 
integrin, beta 4 

CD44 antigen (homing function and Indian 
RAN binding protein 1 
cyclin-dependent kinase inhibitor 2A (me 
cyclin-dependent kinase inhibitor 2A (me 
cyclin-dependent kinase inhibitor 2A (me 
cyclin-dependent kinase inhibitor 2A (me 
hypothetical protein FLJ10540 
baculoviral IAP repeat-containing 5 (sur 
HSPC150 protein similar to ubiquitin-con 
a disintegrin and metalloproteinase doma 
G protein-coupled receptor 87 
KIAA0186 gene product 
KIAA0186 gene product 
KIAA0186 gene product 
KIAA0186 gene product 
NM_005365:Homo sapiens melanoma antigen, 
RP42 homolog 
transcription factor BMAL2 
GPI-anchored metastasis-associated prote 
pim-2 oncogene 

diacylglycerol kinase, zeta (104kD) 
hyaluronan synthase 3 
aldehyde dehydrogenase 3 family, member 
topoisomerase (DNA) II alpha (170kD) 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
parathyroid hormone-like hormone 
KIAA0144 gene product 
ATP-binding cassette, sub-family C (CFTR 
estrogen-responsive B box protein 
achaete-scute complex (Drosophila) homol 
chondromodulin I precursor 
calcitonin-related polypeptide, beta 
calcitonin/calcitonin-related polypeptid 
melanoma antigen, family A, 2 
seizure related gene 6 (mouse)-like 
cancer/testis antigen (NY-ESO-1) 
cancer/testis antigen (NY-ESO-1) 
doublecortex; lissencephaly, X-linked (d 
ESTs, Weakly similar to GGC1_HUMAN G ANT 
G antigen 7B 

laminin, gamma 2 (nicein (100kD), kalini 
ATPase, aminophospholipid transporter-li 
Human DNA sequence from clone RP5-850E9 
NM_021048:Homo sapiens melanoma antigen, 
serine (or cysteine) proteinase inhibito 
lysosomal 

Homo sapiens mRNA; cDNA DKFZp547C136 (fr 
Homo sapiens cDNA FLJ13103 fis, clone NT 
ESTs 

guanine nucleotide binding protein (G pr 
ESTs 

cell division cycle 2, G1 to S and G2 to 
cell division cycle 2, G1 to S and G2 to 
claudin 1 

Homo sapiens clone N11 NTera2D1 teratoca 
ESTs 

wingless-type MMTV integration site fami 
DESC1 protein 

CDC45 (cell division cycle 45, S.cerevis 

RAB38, member RAS oncogene family 

Rh type C glycoprotein 

MAD2 (mitotic arrest deficient, yeast, h 

budding uninhibited by benzimidazoles 1 

serine (or cysteine) proteinase inhibito 

UL16 binding protein 2 

ESTs 

cystatin SN 

cell division cycle 2-like 1 (PITSLRE pr 

ESTs 

artemin 

artemin 

artemin 
artemin 

gb:ye53h05.s1 Soares fetal liver spleen 
hypothetical protein AF301222 
hypothetical protein XP_098151 (leucine- 
NM_002362:Homo sapiens melanoma antigen, 
thyroid hormone receptor interactor 13 
plasminogen activator, urokinase 
Seq ID No: 293 & 294  424629  M90856  Hs.151393 
Seq ID No: 295 & 296  437789  AI581344  Hs.127812 
Seq ID No: 297 & 298  437789  AI581344  Hs.127812 
Seq ID No: 299 & 300  437789  AI581344  Hs.127812 
Seq ID No: 301 & 302  437789  AI581344  Hs.127812 
Seq ID No: 303 & 304  437789  AI581344  Hs.127812 
Seq ID No: 305 & 306  453366  AA847843  Hs.62711 
Seq ID No: 307 & 308  403478 
Seq ID No: 309  441525  AW241887  Hs.127728 
Seq ID No: 310 & 311  434105  AW952124  Hs.13094 
Seq ID No: 312 & 313  428810  AF068236  Hs.193788 
Seq ID No: 314 & 315  413691  AB023173  Hs.75478 
Seq ID No: 316 & 317  423934  U89995  Hs.159234 
Seq ID No: 318 & 319  409228  R16811  Hs.22010 
Seq ID No: 320 & 321  425734  AF056209  Hs.159396 
Seq ID No: 322 & 323  413582  AW295647  Hs.71331 
Seq ID No: 324 & 325  438403  AA806607  Hsi9220S 
Seq ID No: 326 & 327  403329 
Seq ID No: 328 & 329  409893  AW247090  Hs.57101 
Seq ID No: 330 & 331  119073  BE245360  Hs.279477 
Seq ID No: 332 & 333  113195  H83265  Hs.8881 
Seq ID No: 334 & 335  102283  AW161552  Hs.83381 
Seq ID No: 336 & 337  101345  NM_005795  Hs.152175 
Seq ID No: 338 & 339  103280  U84722  Hs.76206 
Seq ID No: 340 & 341  102012  BE259035  Hs.118400 
Seq ID No: 342 & 343  105729  H46612  Ks.2d3815 
Seq ID No: 344 & 345  134299  AW580938  Hs.97199 
Seq ID No: 346 & 347  412719  AW016610  Hs.816 
Seq ID No: 348 & 349  422158  L10343  Hs.112341 
Seq ID No: 350 & 351  128924  BE275383  Hs.26557 
Seq ID No: 352 & 353  100486  T19006  Hs.10842 
Seq ID No: 354 & 355  419121  AA374372  Hs.89826 
Seq ID No: 356 & 357  409459  D86407  Hs.54481 
Seq ID No: 358 & 359  330493  M27826 
Seq ID No: 360 & 361  417866  AW087903  Hs.82772 
Seq ID No: 362 & 363  418113  AI272141  Hs.83484 
Seq ID No: 364 & 365  437016  AU076916  Hs.5398 
Seq ID No: 366 & 367  429612  AF062649  Hs.252587 
Seq ID No: 368 & 369  440704  M69241  Hs.152 
Seq ID No: 370 & 371  431221  AA449015  Hs.286145 
Seq ID No: 372 & 373  431565  AF161470  Hs.260622 
Seq ID No: 374 & 375  431565  AF161470  Hs.260622 
Seq ID No: 376 & 377  132354  BE185289  Hs.1078 
Seq ID No: 378 & 379  424441  X14850  Hs.147097 
Seq ID No: 380 & 381  103768  AF086009  Hs.295398 
Seq ID No: 382 & 383  417512  X76534  Hs.82226 
Seq ID No: 384 & 385  425266  J00077  Hs.155421 
Seq ID No: 386 & 387  424503  NM_002205  Hs.149609 
Seq ID No: 388 & 389  400269  X07820  Hs.2258 
Seq ID No: 390 & 391  418007  M13509  Hs.83169 
Seq ID No: 392 & 393  418007  M13509  Hs.83169 
Seq ID No: 394 & 395  418738  AW388633  Hs.6682 
Seq ID No: 396 & 397  415138  C18356  Hs.295944 
Seq ID No: 398 & 399  418506  AA084248  Hs.85339 
Seq ID No: 400 & 401  423561  D13665  Hs.136348 
Seq ID No: 402 & 403  414812  X72755  Hs.77367 
Seq ID No: 404 & 405  417433  BE270266  Hs.82128 
Seq ID No: 406 & 407  417433  BE270266  Hs.82128 
Seq ID No: 408 & 409  422867  L32137  Hs.1584 
Seq ID No: 410 & 411  428227  AA321&49  H5^48 
Seq ID No: 412 & 413  444381  BE387335  Hs.283713 
Seq ID No: 414 & 415  400303  AA242758  Hs.79136 
Seq ID No: 416 & 417  411789  AF245505  Hs.72157 
Seq ID No: 418 & 419  428698  AA852773  Hs.334838 
Seq ID No: 420 & 421  450098  W27249  Hs.8109 
Seq ID No: 422 & 423  421552  AF026692  Hs.105700 
Seq ID No: 424 & 425  452747  BE153855  Hs.61460 
Seq ID No: 426 & 427  450375  AA009647 
Seq ID No: 428 & 429  426215  AW963419  Hs.155223 
Seq ID No: 430 & 431  425247  NM_005940  Hs.165324 
Seq ID No: 432 & 433  432201  AI538613  Hs^8241 
Seq ID No: 434 & 435  427685  D31152  Hs.179729 
Seq ID No: 436 & 437  442117  AW864964  Hs.128899 
Seq ID No: 438 & 439  431211  M86849  Hs.323733 
Seq ID No: 440 & 441  447033  AI357412  Hs.157601 
Seq ID No: 442 & 443  447033  AI357412  Hs.157601 
Seq ID No: 444 & 445  447033  AI357412  Hs.157601 
Seq ID No: 446 & 447  115522  BE614387  Hs.333893 
Seq ID No: 448 & 449  410418  D31382  Hs.63325 
Seq ID No: 450 & 451  409041  AB033025  Hs.50081 
Seq ID No: 452 & 453  409041  AB033025  Hs.50081 
Seq ID No: 454 & 455  452461  N78223  Hs.108106 
Seq ID No: 456 & 457  412420  AL035868  Hs.73853 
Seq ID No: 458 & 459  416658  U03272  Hs.79432 
Seq ID No: 460 & 461  407811  AW190802  Hs.40098 



PCT/US02/12476 



glutamate-cysteine ligase, catalytic sub 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
ESTs, Weakly similar to T17330 hypotheti 
High mobility group (nonhistone chromoso 
NM_022342:Homo sapiens kinesin protein 9 
ESTs 

presenilins associated rhomboid-like pro 
nitric oxide synthase 2A (inducible, hep 
ATPase, Class VI, type Its 
forkhead box E1 (thyroid transcription f 
ESTs, Weakly similar to 2109260A B cell 
peptidylglycine alpha-amidating monooxyg 
hypothetical protein MGC5350 
ESTs 

unnamed protein product [Homo sapiens] 
minichromosome maintenance deficient (S. 
v-ets erythroblastosis virus E26 oncogen 
ESTs, Weakly similar to S41044 chromosom 
guanine nucleotide binding protein 11 
calcitonin receptor-like 
cadherin 5, type 2, VE-cadherin (vascula 
singed (Drosophila)-like (sea urchin fas 
Homo sapiens HSPC285 mRNA, partial cds 
complement component C1q receptor 
ESTs 

protease inhibitor 3, skin-derived (SKAL 
plakophilin 3 

RAN, member RAS oncogene family 
parathyroid hormone-like hormone 
low density lipoprotein receptor-related 
endogenous retroviral protease 
collagen, type XI, alpha 1 
SRY (sex determining region Y)-box 4 



pituitary tumor-transforming 1 
insulin-like growth factor binding prote 
SRB7 (suppressor of RNA polymerase B, ye 
butyrate-induced transcript 1 
butyrate-induced transcript 1 
small proline-rich protein 1B (cornifin) 
H2A histone family, member X 
gb:Homo sapiens full length insert cDNA 
glycoprotein (transmembrane) nmb 
alpha-fetoprotein 

integrin, alpha 5 (fibronectin receptor, 
matrix metalloproteinase 10 (stromelysin 
matrix metalloproteinase 1 (interstitial 
matrix metalloproteinase 1 (interstitial 
solute carrier family 7, (cationic amino 
tissue factor pathway inhibitor 2 
G protein-coupled receptor 39 
periostin (OSF-2os) 
monokine induced by gamma interferon 
5T4 oncofetal trophoblast glycoprotein 
5T4 oncofetal trophoblast glycoprotein 
cartilage oligomeric matrix protein (pse 
small inducible cytokine subfamily B (Cy 
ESTs, Weakly similar to S64054 hypotheti 
LIV-1 protein, estrogen regulated 

KIAA1886 protein 

hypothetical protein FLJ21080 

secreted frizzled-related protein 4 

Ig superfamily receptor LNIR 

a disintegrin and metalloproteinase doma 

stanniocalcin 2 

matrix metalloproteinase 11 (stromelysin 
transmembrane protease, serine 3 
collagen, type X, alpha 1 (Schmid metaph 
ESTs; hypothetical protein for IMAGE:447 
gap junction protein, beta 2, 26kD (conn 
ESTs 
ESTs 
ESTs 

c-Myc target JPO1 
transmembrane protease, serine 4 
Hypothetical protein, XP_051860 (KIAA119 
Hypothetical protein, XP_051860 (KIAA119 
transcription factor 
bone morphogenetic protein 2 
fibrillin 2 (congenital contractural ara 
cysteine knot superfamily 1, BMP antagon 
Seq ID No: 462 & 463  437852 
Seq ID No: 464 & 465  402075 
Seq ID No: 466 & 467  421110 
Seq ID No: 468 & 469  451668 
Seq ID No: 470 & 471  451668 
Seq ID No: 472 & 473  451668 
Seq ID No: 474 & 475  422282 
Seq ID No: 476 & 477  425852 
Seq ID No: 478 & 479  439738 
Seq ID No: 480 & 481  427747 
Seq ID No: 482 & 483  420281 
Seq ID No: 484 & 485  405932 
Seq ID No: 486 & 487  405932 
Seq ID No: 488 & 489  444342 
Seq ID No: 490 & 491  421379 
Seq ID No: 492 & 493  417079 
Seq ID No: 494 & 495  430890 
Seq ID No: 496 & 497  419721 
Seq ID No: 498 & 499  444471 
Seq ID No: 500 & 501  413063 
Seq ID No: 502 & 503  433800 
Seq ID No: 504 & 505  452401 
Seq ID No: 506 & 507  452401 
Seq ID No: 508 & 509  450001 
Seq ID No: 510 & 511  410407 
Seq ID No: 512 & 513  309931 
Seq ID No: 514 & 515  412719 
Seq ID No: 516 & 517  417034 
Seq ID No: 518 & 519  430486 
Seq ID No: 520 & 521  413753 
Seq ID No: 522 & 523  425650 
Seq ID No: 524 & 525  423673 
Seq ID No: 526 & 527  418663 
Seq ID No: 528 & 529  418663 
Seq ID No: 530 & 531  429610 
Seq ID No: 532 & 533  406690 
Seq ID No: 534 & 535  431846 
Seq ID No: 536 & 537  422158 
Seq ID No: 538 & 539  431958 
Seq ID No: 540 & 541  437044 
Seq ID No: 542 & 543  428484 
Seq ID No: 544 & 545  429211 
Seq ID No: 546 & 547  417389 
Seq ID No: 548 & 549  431009 
Seq ID No: 550 & 551  417542 
Seq ID No: 552 & 553  449230 
Seq ID No: 554 & 555  410555 
Seq ID No: 556 & 557  410555 
Seq ID No: 558 & 559  424687 
Seq ID No: 560 & 561  418462 
Seq ID No: 562 & 563  410274 
Seq ID No: 564 & 565  439606 
Seq ID No: 566 & 567  404877 
Seq ID No: 568 & 569  444781 
Seq ID No: 570 & 571  418543 
Seq ID No: 572 & 573  415817 
Seq ID No: 574 & 575  415817 
Seq ID No: 576 & 577  415817 
Seq ID No: 578 & 579  415817 
Seq ID No: 580 & 581  415817 
Seq ID No: 582 & 583  415817 
Seq ID No: 584 & 585  421817 
Seq ID No: 586 & 587  418678 
Seq ID No: 588 & 589  418678 
Seq ID No: 590 & 591  409420 
Seq ID No: 592 & 593  332180 
Seq ID No: 594 & 595  408790 
Seq ID No: 596 & 597  408790 
Seq ID No: 598 & 599  439223 
Seq ID No: 600 & 601  409757 
Seq ID No: 602 & 603  428969 
Seq ID No: 604 & 605  428969 
Seq ID No: 606 & 607  428969 
Seq ID No: 608 & 609  428969 
Seq ID No: 610 & 611  450701 
Seq ID No: 612 & 613  450701 
Seq ID No: 614 & 615  414774 
Seq ID No: 616 & 617  407944 
Seq ID No: 618 & 619  407944 
Seq ID No: 620 & 621  457489 
Seq ID No: 622 & 623  429547 
Seq ID No: 624 & 625  407242 
Seq ID No: 626 & 627  407242 
Seq ID No: 628 & 629  407242 
Seq ID No: 630 & 631  444006 



DCUUIOvW 


H&2S6897 


AJ25Q717 


Ns.1355 


Z43948 


Hs.326444 


243948 


Hs.326444 


Z43946 


H&326444 






AKQ01S04 


Hs.159651 


BE246502 


Hs.9598 


AW411425 


Hs.180655 


Aifi23fi93 


HS^23494 


MM 014398 


Hs.10887 


Y15221 


Hs 103982 




H& 81134 


X54232 


(fe.2699 


111 lw_UUI will/ 


Hs.2B8fiS0 


AB020884 


Hs.11217 




H5.75184 


AI034361 


Hs13S1S0 


NM 007115 


Hs.29352 


NM 007115 


HS.293S2 


NM 001044 


H&406 


yRS839 




AW341fi83 




AUtfOlfifilQ 
nwuiQu iw 


HsA16 


NM QDB183 


'Hs.80982 




Hfi^4i<;<;i 

rK>«&*r 1 on 1 


U17760 


n9>l 99 1 f 


KlU flniQM 


H8.1925 




Ha.1605 


fvwm 1 1 uu 


H&ilfiSQ 


AMYiimn 

/UVUUI lUl/ 


UeilRQA 






M29540 






IKhd ItfBW 


LI 0343 


Hs.11 2341 


X63629 


Hs.2877 


ALj035864 


Hs.69517 


AF104032 


Hs.164601 






BE260964 


Hs.82045 


BE149762 


Hs.48956 


J04129 


Hs 82269 


QCO l<M*tW 


Hs 21 1579 




Hs.64311 


U92649 


Hs.64311 




Hs 1<i17% 


DCmU I«I9U 




AA381Ba7 


Hs.617fi2 


vf rvicw 


Hsf8S61 


NM 014400 


Hs.11950 


NM 005329 


Hs.85962 


U88967 


HS.78B67 


U88967 


H&78867 


U68967 


Hs.78867 




nStf Ouwr 


1 IPflQR7 


Uc 7RRR7 




He 78867 




ns* 1 uooDU 




nStiu/ jf 9 


KIM fmuT? 




£.l9UUO 


nS.3*HK}l 


ACi'iA^^u\ 
AriiWlOu 


He 7^7 




HedTflfiO 




UcATRfifl 




n3.^«iuoio 


MM (lAlflQR 
ncvL.uuioao 






H8.1946B9 


AF120274 


Hs.194689 


AF120274 


H&.1946B9 


AF1 20274 


Hs.194669 




HS.2884S7 


H39980 


Hs.288467 


X02419 


Hs.77274 


R34008 


Hsu!39727 


R34008 


Hs^9727 


AI693815 


Hs.127179 


AW009166 


Hs.99376 


M18728 




M18728 




M18728 




BE3950B5 


Hs.10086 



ESTs, Weakly similar to dJ365012.1 [H^ 

ENSP0000Q25105G^PIasmamernbianecdloliim 

cathepsin E 

cartilage acidic protein 1 
cartilage acidic protein 1 
cartilage acidic protein 1 
apoQpoprotebil 

death receptor 6, TNF superfamily member 
sema domain, immunoglobulin domain (Ig), 
serine/threonine kinase 12 
Predicted cation efflux pump 
C15000305:gi|3806122|gb|AAC69198.1| (AF0 
C15000305:gi|3806122|gb|AAC69198.1| (AF0 
similar to lysosome-associated membrane 
small inducible cytokine subfamily B (Cy 
interleukin 1 receptor antagonist 
glypican 1 
aquaporin 4 
KIAA0877 protein 

chitinase 3-like 1 (cartilage glycoprote 
lung type-I cell membrane-associated gly 
tumor necrosis factor, alpha-induced pro 
tumor necrosis factor, alpha-induced pro 
solute carrier family 6 (neurotransmitte 
carbonic anhydrase IX 
gbAd13dOlj(1 SoafBS_NFljr.6BC.S1 Homos 
ESTs 

neurotensin 

chloride channel, calcium activated, fam 
laminin, beta 3 (nicein (125kD), kalinin 
desmoglein 3 (pemphigus vulgaris antigen 
matrix metalloproteinase 12 (macrophage 
desmocollin 3 
desmocollin 3 

LUNX protein; PLUNC (palate lung and nas 
carcinoembryonic antigen-related cell ad 
uroplakin 1B 

protease inhibitor 3, skin-derived (SKAL 
cadherin 3, type 1, P-cadherin (placenta 
differentially expressed in Fanconi's an 
solute carrier family 7 (cationic amino 
gap junction protein, beta 5 (connexin 3 
midkine (neurite growth-promoting factor 
gap junction protein, beta 6 (connexin 3 
progestagen-associated endometrial prote 
melanoma cell adhesion molecule 
a disintegrin and metalloproteinase doma 
a disintegrin and metalloproteinase doma 
matrix metalloproteinase 9 (gelatinase B 
integrin, beta 4 
hypoxia-inducible protein 2 
G protein-coupled receptor 87 
NM_005365:Homo sapiens melanoma antigen, 
GPI-anchored metastasis-associated prote 
hyaluronan synthase 3 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
ATP-binding cassette, sub-family C (CFTR 
cancer/testis antigen (NY-ESO-1) 
cancer/testis antigen (NY-ESO-1) 
laminin, gamma 2 (nicein (100kD), kalini 
claudin 1 

neurotrophic tyrosine kinase, receptor, 

neurotrophic tyrosine kinase, receptor, 

UL16 binding protein 2 

cystatin SN 

artemin 

artemin 

artemin 

artemin 

hypothetical protein XP_098151 (leucine- 

hypothetical protein XP_098151 (leucine- 

plasminogen activator, urokinase 

desmocollin 2 

desmocollin 2 

cryptic gene 

ESTs 

gb:Human nonspecific crossreacting antig 
gb:Human nonspecific crossreacting antig 
gb:Human nonspecific crossreacting antig 
type I transmembrane protein Fn14 
Seq ID No: 632 & 633 
Seq ID No: 634 & 635 
Seq ID No: 636 & 637 
Seq ID No: 638 & 639 
Seq ID No: 640 & 641 
Seq ID No: 642 & 643 
Seq ID No: 644 & 645 
Seq ID No: 646 & 647 
Seq ID No: 648 & 649 
Seq ID No: 650 & 651 
Seq ID No: 652 & 653 
Seq ID No: 654 & 655 
Seq ID No: 656 & 657 
Seq ID No: 658 & 659 
Seq ID No: 660 & 661 
Seq ID No: 662 & 663 
Seq ID No: 664 & 665 
Seq ID No: 666 & 667 
Seq ID No: 668 & 669 
Seq ID No: 670 & 671 
Seq ID No: 672 & 673 
Seq ID No: 674 & 675 
Seq ID No: 676 & 677 
Seq ID No: 678 & 679 
Seq ID No: 681 
Seq ID No: 682 & 683 
Seq ID No: 684 & 685 
Seq ID No: 686 & 687 
Seq ID No: 688 & 689 

TABLE 15B 



429597 
422109 
419235 
449048 
419215 
431462 
448243 
426427 
445537 
422278 
428450 
446619 
453392 
426514 
425776 
425776 
431515 
4194S2 
432653 
432653 
432653 



410001 
426501 
408369 
445413 
422424 



420610 



NM_003816  Hs.2442  a disintegrin and metalloproteinase doma 
S73265  Hs.1473  gastrin-releasing peptide 
AW470411  Hs^88433  neurotrimin 
Z45051  H5^820  similar to S68401 (cattle) glucose induc 
AU076718  Hs.164021  small inducible cytokine subfamily B (Cy 
AW583672  Hs.256311  granin-like neuroendocrine peptide precu 
AW369771  Hs^2620  integrin, beta 8 
M86699  Hs.169840  TTK protein kinase 
AJ245671  Hs.12844  EGF-like-domain, multiple 6 
AF072673  Hs.114218  frizzled (Drosophila) homolog 8 
NM_014791  Hs.184339  KIAA0175 gene product 
AU076643  Hs.313  secreted phosphoprotein 1 (osteopontin, 
U23752  Hs.32964  SRY (sex determining region Y)-box 11 
BE616633  Hs.170195  bone morphogenetic protein 7 (osteogenic 
U25128  Hs.159499  parathyroid hormone receptor 2 
U25128  Hs.159499  parathyroid hormone receptor 2 
NM_012152  Hs.258583  endothelial differentiation, lysophospha 
U33635  Hs.90572  PTK7 protein tyrosine kinase 7 
N62096  Hs.293185  ESTs, Weakly similar to JC7328 amino aci 
N62096  Hs.293185  ESTs, Weakly similar to JC7328 amino aci 
N62096  Hs.293185  ESTs, Weakly similar to JC7328 amino aci 
N62096  Hs.293185  ESTs, Weakly similar to JC7328 amino aci 
AB041038  Hs.57771  kallikrein 11 
AW043782  Hs.293616  ESTs 
R38438  Hs.182575  solute carrier family 15 (H+/peptide transport 
AA151342  Hs.12677  CGI-147 protein 
AI18&431  Hs.296638  prostate differentiation factor 
L22524  Ks^  matrix metalloproteinase 7 (matrilysin, 
AI663183  Hs.99348  distal-less homeobox 5 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey  CAT Number 

309931 AW341683 

330493 33264J5 

439285 47065.1 



450375 



451320 86576L1 



Accession 

M27826 R78416 AA307645 AWg57879 AW957800 AA633529 H03662 

AL133916 N79113AF086101 N76721 AW950828 AA364013 AW955684 AI346341 A1867454 N54784AI655270A1421279AW014882 
AA77S552 N62351 N59253AA626243AI341407 BE175639AA456968A1356918AA457077 

AA009647AA1312S4AA374293AW^5H04410AW6062B4AA151166BE157467BE157601 H04384 W4629^ H01532 

AAig0993 H03231 H59605 H01642 AA852876AA1 13758 AA628915AA746952AI161 014 AA099S54 R69087 

AW1 1 8072 AI63ig82 T15734 AA224195 AI701458 W20198 F26326 AA690570 N80S52 AW071907 AI6713S2 A1375892 T03517 R88265 

AI124088AA224388AI084316At3546e8T33652A1140719AI72Q211T03490AI372637T15415AW205836AA630384TO^ 

AA017131AA443303 T33623AI222556T33S11T337B5AI419606DS5612 



TABLE 15C

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey 

402075 

403329 

403478 

404440 

404877 

405770 



Ref 

8117407 
8516120 



7528051 
1519284 
2735037 
7767812 



Strand 

Plus 

Plus

Plus 

Plus 

Plus 



121907.122035.122804-122921,124019.124161.1244S5-124610,125672.126076 

96450-96598 ' 

116458-116564 

8043041581 

1095-2107 

61057^5 

123S25-123713
Table 16



Seq ID NO: 1 DNA sequence

Nucleic Acid Accession #: NM_001216

Coding sequence: 43-1422



1 11 21 31 41 51

| | | | | |

GCCCGTACAC ACC3GT6TGCT GGGACACCCC ACAGTCAGCC GGATGGCTCC CCTGTGCCCC 60 

A6CCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG 6CCTCACTGT GCAACTGCTG 120 

CT6TCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180 

TCCCCCTTOO GAGOAQGCTC TTGTGGGGAA GATGACCCAC TGGGCGA66A GGATCTGCCC 240 

AGTOAAGAGQ ATTCACXraWS AGAGGAGGAT CCACCXX3GAG AGGAGGATCT ACCTGGAGAQ 300 

GA6GATCTAC CTGGRGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAOftGGGC 360 

TCCCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAO 420 

AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAQGCGAC 480 

CCGCCCTGGC CC0G68TGTC CXCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540 

OSCCCCCAGC TCGOCGCCTT CTGCCCX3GCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600 

CTC5CCGCCGC TCCCAGAACT GC6CCTGCGC AACAATGQCC ACAGTGTGCA ACTGACCCTG 660 

CCTCCTGGGC TAGAGATGGC TCTGGQTCCC GGGOGGGAGT ACCGGQCTCT GCAGCTGCAT 720 

CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTQ6AAGG OCACCGTTTC 780 

CCTGCGGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840 

GGGCGCCCG6 GAGQCCTGGC CGTGTTGGCC 6CCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG A6CAGTTGCT QTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CA6GTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCIG ACTTCAOCCG CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACXX3CCCTGT GCCCAGGGTG TCATCTQGAC TGTGTTTAAC 1080 

CAGACAGTOA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCX3AGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCC AGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA OCCCTOOTTT TTGGCCTCXSP TTTTQCIGTC 1320 

ACCAGCGTOS CGTTCCTTGT 6CAGAT6AGA AGGCAGCACA GAA6666AAC C3kAAGGGGST 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCX3AG ACTOGAGCCT AGAGGCTGGA TCTTOGAGAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA ACTGTCCTGT CCTGCTCRTT 1500 
ATGCCACTTC CTTTTAACTQ CCAAGAAATT TTTTAAAATA AATATTTATA AT 



Seq ID NO: 2 Protein sequence:
Protein Accession #: NP_001207

1 11 21 31 41 51

HAPLCPSPmi PLZiIPAFAPG LTVQLLL8LL LLMPVHPQSL PRMQEDSPLG GGSSGEDDPL 60 

6BEDLPSEED 8PREBDPPGE EDLFGEEDLP GEEDLPEVKP K5EEEGSLKL HDLPTVEAPG 120 

DPQEPQliniAH RDKEGDDQSH WRYGCTPPWP RVSPACAGRF QSPVDIRPQL AAFCPA LRPL IBO 

SLLGFQIiPPXi FELRIiRNNCai SVQLTLPPGL EMALGFGRBY RALQLELHWG AAGRF6SEBT 240 

VEOBRFPABX BWHLSTAFA RVDBALORPG GLAVXAAFLE EGPEEBSAYE QXiLSRLBBIA 300 

BEGSETQVPG LDISALLPSD FSRYFOYEGS IiTTFPCAQGV IHTVFNQTVN LSAKQLHTLS 360 

DTIiWGPGDSR LOLNFRATQP LNSRVIEASF PAOVDSSPSA ABFVQWSCL AAGDILALVF 420 
OLLFAVTSVA PLVQMRRQHR RGTKGSVSYR PABVAEXGA 
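
The "Coding sequence" coordinates given for each DNA entry can be cross-checked against the length of the paired protein entry. The following is a minimal sketch, not part of the original listing. It assumes the coordinates are 1-based and inclusive and that the stated range includes the terminal stop codon. The function names are hypothetical and chosen for illustration only.

```python
def cds_length(start: int, end: int) -> int:
    """Nucleotide length of a coding range given 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def expected_protein_length(start: int, end: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by the range.

    Assumption: when includes_stop is True, the final codon of the range is a
    stop codon and contributes no residue.
    """
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError("coding range length is not a multiple of three")
    codons = n // 3
    return codons - 1 if includes_stop else codons


def extract_cds(sequence: str, start: int, end: int) -> str:
    """Slice the coding range out of a sequence string (1-based, inclusive)."""
    return sequence[start - 1:end]


# Seq ID NO: 1 is annotated with coding sequence 43-1422.
print(cds_length(43, 1422))               # 1380 nucleotides
print(expected_protein_length(43, 1422))  # 459 residues
```

Under these assumptions the range 43-1422 spans 1380 nucleotides, or 460 codons, which corresponds to 459 residues plus a stop codon. This agrees with the 459 residue positions listed for Seq ID NO: 2 (seven rows of 60 and a final row of 39).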

Seq ID NO: 3 DNA sequence
Nucleic Acid Accession #: BC013923
Coding sequence: 438-1391

1 11 21 31 41 51

| | | | | |

AOOGOGGTTG TCTATTAACT TOTTCAAAAA GTATCAQOAO TTGTCAAGGC AGAGAAGAGA 60 

GTGTTTGCAA AAGGGGGAAA GTAGTTTQCT GCCTCTTTAA GACTAGGACT GAGAGAAAGA 120 

AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAQCCCCAG GCTTAAGCCT TTC CAAAAAA 180 

TAATAATAAC AATCATOGGC GGCGGCAGGA TOGGCCAGAG GAGGAGGGAA GCGCTTTTTT 240 

TGATCCTGAT TCCAOTTTOC CTCTCTCTTT TTTTCCCCCA AATTATTCTT CGCCTGATTT 300 

TCCTOGOGQA GCCCTGOGCT CCOOACACCC CCGCCCGCCT CCCCTCCTCC TCTCCCCCCO 360 

CCCGCGGGCC CCCCAAAGTC OCGGCCGGGC CGAGGGTCX3G CGGCCGCCG6 CGGGCOGGGC 420 

CCGCGCACA6 CGCCCGCATG TACAACATGA TGGAGACG6A 6CTGAAGOCX3 COOaaOOOOC 480 

A6CAAACTTC GGGGGGCGGC GGOGGCAACT CCACCGCG6C GGCGGCOGGC GGCAACCAGA 540 

AAAACAGCCC GGACOGCGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCOGOGGGC 600 

AGCGGC6CAA GATGGCCCAG GAGAACCCCA AGATGCACAA CTCGQAGATC AGCAAGCGCC 660 

TGGGCGCCGA GTGGAAACTT TTGTOGGAQA C6GAGAA60G GCCGTTCATC GACGAGGCTA 720 

AGCGGCTGOG AGCGCTGCAC ATGAAGGAGC ACCOSOATTA TAAATACOGG COCOSGOSGA 780 

AAACCAAGAC GCTCATGAA6 AAGGATAAGT ACACGCTGCC CGGOGGGCTG CTGGCOCCCG 640 

GCGGCAATAG CATGGCGAGC GGGGTCQGGG TGGGCGCCGG CCTGGGCGCG GGCGTGAACC 900 

AGCGCATGGA CAGTTACGOS CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGQACCAGCT GGGCTACCCG CA6CACC0GG GCCXCAATGC GCACGGCGCA GCQCAGATGC 1020 

AGCCCATGCA CCGCTACGAC GTOAOGGGCC TQCAGTACAA CTCCATGACC AGCTTOCAGA 1080 

CCTACATGAA CGGCTCGCCC ACCTACAGCA TGTCCTACTC GCAGCAGGGC ACCCCTGGCA 1140 

TGGCTCTTGG CTCCATGGGT TCGGTGGTCA AGTCC6AGGC CAGCTCCAGC CCCCCTGTGQ 1200 

TTACCTCTTC CTCCCACTCC AGGGCGCCCT GCCAGGCCGG OGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGCGCC GAGGTGCGGG AACCCGCCGC GCCCAGCAGA CTTCACATGT 1320 

CGCAGCACTA CCAGA6CGGC CCGOTGCOOG GCAOQQOCAT TAAGQGCACA CTGCCCCTCT 1380 

CACACAT6TG AGGGCCGGAC AGCGAACTGG A6GGGGQAGA AATTTTCAAA GAAAAAGGA6 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAGA6T AAGAAACAGC ATGGAGAAAA CCCGGTACGC 1500 

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCGCGATGC OGACAAGAAA ACTTTTATGA 1620 

GAGAGATCCT GQACTTCTTT TKQGG6GACT ATTTTTQTAC AGAGAAAAOC TGGGGAGGGT 1680 

GG6GAG6GCG GGGGAATGGA CCTTGTATAO ATCTGGftGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGQAGCT TTGCAGGAAG TTTGCAAAAG TCTTTACCAA 1800 

TAAXATTTAG AGCTAGTCTC CAAGCGACQA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAGTA TTTATOGAGA TAAACATGGC AATCAAAATO TCCATTGTTT ATAAGCT6AG 1920 
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AATTTGCCAA TATTTTTCAA GGAGA6GCTT CTTGCTGAAT TTTGATTCTG CAGCTGAAAT 1980 

TIAG6ACAGT TGCAAAOSTG AAAAGAAGAA AATTATTCAA ATTTGGACAT TTTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGO OQAACCATCT CTGTGGTCTT 2100 

GTTTAAAAAG GQCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAATGGCCAT GCAGGTTGAC ACCGTTGGTA ATTTATAATA GCTTTTOTTC QATCCCAACT 2220 

TTCCATTTTG TTCAGATAAA AAAAACCATG AAATTACTGT GTTTQAAATA TTTTCTTATG 2280 

GTTTGTAATA TTTCT6TAAA TTTATTGTOA TATTTTAAGG TTTTCCECCC TTTATTTTCC 2340 

GTAGTTGTAT TTTAAAAGAT TOGGCTCTOT ATTATTrGAA TCAGTCTGCC GAGftATCCAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA AC3«X3TACAT TTTCAACTTA AGTTTTTACT 2460 

CCATTATGCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACT6AA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AACAAAAAAA AAAACAAAAC 2580 

GACSU^CACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 2640 
CCACAACAC31 AACAACAACA CACAGAGGG 

Seq ID NO: 4 Protein sequence: 
Protein Accession #: CAA83435.1

1 11 21 31 41

| | | | |

MYNMMETEIjK PPGPQQTSGG GGCaiSTAAAA GGNQKNSPDR V KRPMKA FMV 
QENPKMHNSE I8KRLGAEWK LLSETEKRPF IDEAKRLRAL HMKEHFDYKY 
KKDKYTLPG6 IiIiAPGGNSMA SGVGVGAGLO AGVKQRMDSY AEMNGWSNGS 
PQBF6LNAHG AAQMQPMHRY DVSAIiQYNSM TSSQTYMNGS PTYSMSYSQQ 
QSWKSEA5S SPPWTSSSH SRAPOQAGDL RSMISKYLPG AEVFEPAAPS 
OFVPOTAZNO TLPLSBM 

Seq ID NO: 5 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29-541 

1 11 21 31 41 51 

OGGACTTGGC TTGTTAGAAG GCTGAAA6AT GATGGCAGGA ATGAAAATCC AGCTTGTATG 60 

CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCT0T6CTCA 6ATTCA6AA6 AGGAAATGAA 120 

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACftTCA AAGATTAGTA AAGCAGATGT 180 

TCXXnCTTGG AAGATGACTC TGCTAAATOT TTGCAtSTCTT GTAAATAATT TGAACA6CCC 240 

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GGCTTTAGCT TGGAAGCSU^T GTTGAC3VATA TACCAGCTCX: ACAAAATCT6 360 

TCACRGC3VGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GQAAAGGAA6 AAOTCATAAA GAGAAAAATT CCTTATATTC TGAAAOQQCA 480 

QCT0TAT6A6 AATAAACCCA GAAGACCCTA CATACTCSUUi AOAflATTCTT ACTATTACTO 540 

AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTG TGTGAAAATG TGACAAAC3\C ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660 

ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGG6CC GC3UVTT 



Seq ID NO: 6 Protein sequence: 
Protein Accession #: AAB50564

1 11 21 31 41 51

| | | | | |

MMAGMKIQLV CHLZiIiAFSSW SLC8DSBBEM KALEADFLTN MHTSKISKAH VFSWKMTLLK 60 

VCSLVNNLNS FAEBT6EVBE EEIiVARRKLP TALDGPSUBA MLTIYQLHKI CHSRAFQHHE 120 
LIQEDILDT6 NDKNGKEBVI KRKIFYILKR QLYENXFRRF THiRRDSVyY 

Seq ID NO: 7 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109-2940

1 11 21 31 41 51

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTQACACA 60 

ATGTATOCAG CAGGCTCAGT GTGAGTGAAC TGQAG6CTTC TCTACAACAT GACCCAAAGG 120 

AGCATT6CAG QTCCTATTTG CAACCT6AAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAQAACCTCA TCTGAAACAT TAAGQAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA ^AGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTQACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGGAAAA QAGGGAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTQOC TACGGATCAC GAGGCCSAGT GTTTOTCCAT 600 

GAATGGGCCC ACCTCCGTTG GGGTGTOTTC GATGAGTATA ACAATGACAA ACCTT TCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CT6ACATCAC AGQCATTTTT 720 

GTGTGTGAAA AAGQTCCTTG CCCCCAAGAA AACTQTATTA TTA6TAAGCT TTTTAAAQAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTO CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTOTGGTTGA ATTTTGTAAT OCAAOTACCC ACAACCAAGA AOCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCA6AAGT 6CATGQ6AT6 TAATCACAGA CTCT6CTGAC 960 

TTTCACCACA GCTTTCCCAT QAATGGGACT 6AGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020 

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCT6GATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATT6TTGAA 1140 

ATTCATACCT TCGTG66CAT TGCCAOTTTC GACAGCAAAO GAGAGATCA6 AGCCCAGCTA 1200 

CACCAAATTA ACAGCAATGA TQATCGAAAG TTGCTGGTTT CATATCT6CC CAC CACTG TA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGQATTTGA GGTGGTT6AA 1320 

AAACTGAATG GAAAAGCTTA TQGCTCTGTG ATGATATTAO TGACCAGCGG A6AT6ATAA6 1380 

CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCA6T6GTT CAACAATTCA CTCCATTGCC 1440 



51 
I 

HSRGQIlXtXHA 60 

RPRKKTKXLM 120 

YSMMQDQLGY 180 

GTPGMALGSM 240 

RURMSOHYQS 300



CTG66TTCAT 
TTCTTTOTTC 
TCTG6AACTG 
AAACCTCACC 
ATGTTTCTA6 
GGAGGftAAAT 
TGGATTCCAG 
TCrCTGCAAG 
GCCACTGTGG 
TATGCCAATG 
QAGCCAGAGA 
GTTArCAAAAA 
TATA6CTTGA 
CCAGGQAGTC 
QCTCGAAGQA 
AGCTCA66A6 
CCACCATGCA 
TG6ACA6CAC 
A6TAAAAQTC 
AA6CX?AAATC 
AC6AATGGAC 
GCAATAC6AG 
CCTCTGTTTA 
GGAQTTTTAA 
CATACTTTAA 
ATAAATATCC 
CATACTAACA 
ATAC3U3ATAA 
CCTTACAC!TT 
6CAAAGGGAA 
AATAGCCCCA 
TCATTTAGTT 
TTTACATGAA 
CTTGCTATTT 
TTTCACTGTA 
TTTATGACAA 
TTTCTAAGTT 
TACCTAQQAA 



CTGCAGCCCC 
CAGATATATC 
GAGACATTT7 
ATCAATTGAA 
TTACGTGGCA 
ACTACACAAA 
GAACAGCTAA 
C0CT6AAA6T 
AAGCCTTTGT 
TGAAACAGGG 
CTGGAGATCC 
AT6AT6GAAT 
AAGTGCATGT 
ATGCTATGTA 
AATCAGTAGG 
GCTCCTTTTC 
AAATTATT6A 
CTGGA6AAGA 
TACA6AATAT 
CTCAGCAA6C 
CTGAACATCA 
CAATQGATAQ 
TTCCCCCCAA 
CAGCAATGGG 
GCAGGAAAAA 
AAAGTGTCTT 
AAGTCAAATT 
GATTTTTACA 
T66CTATGAA 
GQGTAAAGTC 
A6CAGAGAAA 
ACTTTGATTA 
GATCAT6CTA 
TGTTATATAT 
AGAGGTAACX: 
AGGTCTATTG 
TATtOCCTTQ 
A 



AAATCTGGAG 
AAACTCCAAT 
CCAGCAACAT 
AAACACA6T6 
GGCCAGTGGT 
TAATTTTATC 
GCCTGG6CAC 
GACAGTGACC 
GGAAAGAGAC 
ATTTTATCCC 
T6TTACGCT0 
TTACTOGAGG 
CAATCACTCT 
TGTACCAGGT 
CAGAAATGAG 
AGTGCTGGGA 
CCTGGAAGCT 
CTTTGATCAG 
CCAAGAT6AC 
TGGCATCAGG 
GCCAAATGGA 
fflVACTCCTTA 
TTCTQATCCT 
TTTGATAGGA 
GAQA6CAGAC 
CCTTCTTAGA 
AACATCAAAA 
TG6TAGATCA 
CAAATAATAA 
GGACCAGTGT 
AGGAGGGTAG 
ATTTTTCPrT 
TATTTTATAT 
ATTTCAQATG 
TTTAACAATA 
AATTTATTTG 
GGTTATTATG 



GAATTATCAC 
AGCATGATTG 
ATTCAGCTTG 
ACTGTGGATA 
CCTCCTGA6A 
ACCAATCTAA 
TGiGACTTACA 
TCTCGCGCCT 
AGCCTCCATT 
ATTCTTAAT6 
AGACTCCTTG 
TATTTTTTCT 
CGCAOCATAA 
TACACAGCAA 
GAGGAGCGAA 
6TTCCAGCT6 
GTAAAAOTAO 
GGCCAGGCTA 
TTTAACAATG 
GAGATATTTA 
GAAACACATG 
CAGTCTQCTG 
GTACCTGCCA 
ATCATTTOCC 
AAGAAAGAGA 
TATAAGACCC 
CTGTATTAAA 
ACAATTCTTT 
AAATTATTCT 
CAAGGAAAGT 
GTCTGCATTA 
TCTCCTTATC 
ATQTAGCCCC 
ACATCTCCCT 
TGOGTATTAC 
TNTGTAAGTT 
QAATOATAGX 



GTCTTACAGG 
ATGCTTTCA6 
AAAGTACAQG 
ATACTGTGGG 
TTATATTATT 
CTTTTOGGAC 
CCCTBAACAA 
CCAACTCAGC 
TTCCTCATCC 
CCACTGTCAC 
ATGATG6AGC 
CCTTTGCTGC 
GCACCCCAAC 
ACGGTAATAT 
AGTGGGGCTT 
GCCCCCACCC 
AAOAGGAATT 
CAAGCTATGA 
CTATTTTAGT 
GGTTCTCACC 
AAA6CCACAG 
TATCTAACAT 
6AGATTATCT 
TTATTATAGT 
ATGGAACAAA 
ATGGCCTTGG 
AT6CATTGAG 
TTGGGG6TAG 
TTAAAGTAAT 
TTGTTTTATT 
TAACTGTCTG 
TGTGCAGTAC 
TAATGCAAAG 
GCTAATGCTC 
CTTTGTCTCT 
TCTACTCCCA 
7ATA00CCC3I 



Seq ID NO: 8 Protein sequence: 
Protein Accession #: NP_006527.1



MTQRSZAGFI 
XKEMITEASP 
GDDPYTLQYR 
KPPYINGQNQ 
MFMQSXjSSW 
TPSLVQAGDK 
SAQLHQINSK 
GDDKLLGtrCL 
SRISSGTGDI 
FDPDGRKYYT 
AVPPATVEAF 
AQADVIKNDG 
lOKNAFRKSV 
LTLSWTAPGB 
PQISTNGPEH 
LILKGVLTAM 



11 

I 

OniKFVTUiV 
YLBHATKRRV 

GCXSKBGKYIH 
IKVTRCSSDI 
EFCNASTHNQ 
WCLVIJ>VSS 
DORKLLVSYL 
PTVLSSGSTI 
PQQHIQLEST 
NNFZINLTFR 



I7SRYFFSFA 
GR23EEERKHG 
DFDQGQATSY 
QPNGBTHB8H 
GI.IGZZCLII 



21 
I 

AltSSBLFFLO 
FFRNIKILIP 
PTPNEUJIDN 
TGIFVCEKGP 
EAPNLQHQHC 
KI4AEADRLLQ 
PTTVSAKTDI 
HSIAt^SSAA 
GENVKPHEQI. 
TASLWIP6TA 
PVMZYANVKQ 
ANGRYSLKVH 
FSRVSSGG8F 
BIRMSKSLQN 
RIYVAIRAem 
WTHHTLSRK 



31 

I 

AGVQLQDEIGY 
ATWKANNNSK 
LTAGYGSRGR 
CPQENCIISK 
SLRSANDVZT 
LQQAAEFyiH 
SICSGLKRC^ 
PNItEELSRLT 
KNTVTVDNTV 
KPGHWTYTm 
GPypIUIATV 
VNRSPSZSTP 
SVLGVPAGPH 
IQDDFNNAIL 
RNSLQSAVSN 
KRADKECEMP8T 



41 

1 

ngliiIazhfq 
ikqesyekan 

VFVEEWAHLR 
LFKEGCTFZY 
DSADFBHSFP 
QIVBIHTFVG 
EWEKLNGKA 
GGLKPFVPDI 
GNDTMFLVTW 
immSIjQALK 
TATVEPETGD 
AHSIP6SHAM 
PDVPPPCKZZ 
VNTSKRNPQQ 
XAQAPLFIPP 
KLL 



A6GTTTAAA6 


1500 


TAGAATTTCC 


1560 


T8AAAATGTC 


1620 


CAACGACACT 


1680 


TQATCCTQAT 


1740 


AGCTAGTCTT 


1800 


TACCCATCAT 


1860 


TGT6CCCCCA 


1920 


TGTGATQATT 


1980 


TGCC3«»GTT 


2040 


AGGTGCTGAT 


2100 


AAAT6GTAGA 


2160 


CCACTCTATT 


2220 


TCAGATGAAT 


2260 


TAGCCGAGTC 


2340 


TGATQTGTTT 


2400 


GACCCTATCT 


2460 


AATAAOAATG 


2520 


AAATACATCA 


2560 


CCAGATTTCC 


2640 


AATTTATGTT 


2700 


TGCOCAOGOG 


2760 


TATATTGAAA 


2820 


TGTGACACAT 


2880 


ATTATTATAA 


2940 


ACTACAAAAA 


3000 


TTTTTGTACA 


3060 


ATTAGAAAAC 


3120 


GTCTTTAAAG 


3180 


GAGGTGGAAA 


3240 


TGTGAAGCAA 


3300 


AGGTTGCTTG 


3360 


CTCTTTACCT 


3420 


AGAGATCTTT 


3480 


TCATACCGGT 


3540 


TCAAAGCAGC 


3600 


TATAATGCCT 


3660 


51 




1 

VPENQNLISN 


60 


VZVTDttYGAH 


120 


WGVFDEYNND 


180 


NSTQNATASI 


240 


MKGTSLPPPF 


300 


ZASFDSKGBZ 


360 


YGSVNZLVTS 


420 


8NSNSMIDAF 


480 


QASGPPEllh 


540 


VTVTSRASNS 


600 


FVTIALU30G 


660 


yVPGYTANGH 


720 


DIiEAVKVEEE 


780 


AGIREIPTFS 


640 


NSOPVPARDY 


900 



Seq ID NO: 9 DNA sequence

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 336-632 



CTCCCCTCAC 
CCTGG6TGGG 
6AGCTGGCAC 
CAGGGTTTGG 
CCAGTGGGGC 
GGGTCTGTCT 
CGCTG6CTGT 
AGCTGAGTAA 
AGAAA6TGGA 
AGGAGGTGSA 
ACTTCTTCCA 
TCTCTTGGGC 
TGTTGATAAT 
CTGGGAGAT6 
TCTCCAAGGC 
ATTGGftAATC 
AAATACCA 



11 

I 

CCCGGTCCAG 
CTCAGGGGCT 
TCTCTGGGAG 
TGGGATCAGG 
CCACATATAA 
CTGCCACCTG 
GCTGGTCACT 
GGGGGAAATG 
TGA6GAGGG6 
CTTCCAGGAS 
GGGCTGCCCA 
CCAGGACTGT 
ATTTTAATTG 
AGGGCCTCCT 
CA6A6CTAT0 
GAGATAGGTT 



21 

1 

GATGCCCAGT 
GCCCTTGACC 
GGAGGGGGCT 
TTGAGGCAGG 
ATCCTCACCC 
GTCTGCCACA 
ACCTTCCACA 
AAGGAACTTC 
CTGAA6AA6C 
TATGCTGTTT 
GACCGACCCT 
TGATGCCTTT 
CTCAGTGATG 
GGATCCTGCT 
CTTTAGGTCT 
GCTGACTTTT 



31 

! 

CCCCAC3GACA 
T6GCCTA6A6 
GGGAGGGAAT 
TTTGGTTTCC 
TGGGAGCCTG 
GATCCATGAT 
AGTACTCCTG 
TGCACAAGGA 
TG ATGG GCAG 
TCCTG6CACT 
6AAGCAGAAC 
GAGTTTT6TA 
TTCG^TAACC 
CCCTTCTQG6 
CAATTTTQGA 
ATITTGTCAA 



41 
1 

CCTCCCACTT 
CCCTCCCCCti 
OAGTGGGAAT 
TTAAAATGCC 
GCTGCCTTGC 
GTGCA6TTCT 
CCAAGAGGGC 
GCTGCCCAGC 
CCTGGATGAG 
CATCACTCTC 
TCTTGACTTC 
TTCAATAAAC 
CGGCTGGCTC 
CTCTGACTCT 
ATTTCAAACA 
ATAAAGATAT 



51 

I 

CCCACTGTGG 
GCTGGTGGTG 
GGCAA6AGGC 
AAGTTGGGGG 
TCTCCTTCCT 
CTGGAGCAGG 
GACAAGTTCA 
TTTGTGGGGG 
A ACAG TGACC 
AT6T6CAAT0 
CT6CCATGGA 
TTTTTTTGTC 
AGCTGGAGTQ 
CCTG6AAATC 
CCAOCAAAAA 
TAAAAAAGGC 



60 
120 
160 
240 
300 
360 
420 
460 
540 
600 
660 
720 
780 
640 
900 
960 



Seq ID NO: 10 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

MMCSSLEQAL AVIiVTTFHKY SCQEGDKFKL SKOEMKELLH XELPSFVGEK VDEE6LKKLH 
GSLDEHSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP 



Seq ID NO: 11 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 336-626

1 11 21 31 41 51

| | | | | |
CTCCCCTCAC CCOGGTCCAG GATGCCCAGT CCCCAC6ACA CCTCCCACTT COCACTGTGO 
CCTGGGTGGG CTCAGQGGCT GCCXTTTGAOC TGGCCTAQAO CCCTCCOCCA GCTGGTGGTG 
GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGQAGGGAAT QAGTGGOAAT GGCAAGAGGC 
CAGG6TTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 
CCAGTGGGGC CCACATATAA ATOCTCACCC TGGGAGCCTO GCTGCCTTGC TCTCCTTCCT 
GGGTCTGTCT CTGCCACCTQ GTCIOCiCACA QATCCATGAT QTGGAOTTCT CTGGA6CAGG 
CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 
AGCTQAGTAA GGOGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGC 
ATTCCAGAGA ACCATGTGCT GTGAGGGCCT TCCGAGTCXaV TCTGTTTAAT CCTGTCATTO 
GAGACTTGAG AAAGCAGAGC CCAGAAGGGA AAAGTQATTG TCCCAAGATC ACACAGCACT 
GGAGAAAGTG GATQAGGAGG GGCIGAAGAA GCTGATGGGC A6CCTGGATG AQAACAGTGA 
CCAGCAGGTO 6ACTTCCAG6 AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATOTGCAA 
TGACTTCTTC CAGGGCTGCC CAGACCXSACC CTQAAGCA6A ACTCTTQACT TCCTQCCATG 
GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 
TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CC06GCTGGC TCAGCTGGAG 
TGCTGG6AGA T6AGG60CTC CTGGATCCT6 CTCCCTTCTG GGCTCTOACr CTCCT6GAAA 
TCTCTCCAAG GCCAGAGCTA TGCTTTAQGT CTCAATTTTG GAATTTCAAA CACCAGCAAA 
AAA7TGGAAA TCGAGATAGG TTGCTGACTT nATTTTOTC AAATAAAGAT ATTAAAAAAO 
GCAAATACCA 



Seq ID NO: 12 Protein sequence:
Protein Accession #: Eos sequence



1 11 21 31 41 51

| | | | | |

MMCSSLEQAL AVLVTTFBinr SCQEGDKFKL SKQEMKBLLB KELPSFVGBS SSPCAVRAFR 

VBLFlSrPVXGD LRNQSPBGKS DCPKITQKMR KHMRHG 



Seq ID NO: 13 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 58-354

1 11 21 31 41 51

| | | | | |
GTGAGCTCAC CATGTGGGGG T6AGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 
ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 
TGCCAAGAGG GCGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATQAGGAGG GGCTGAAGAA GCTGATGGGC 
AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 
CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 
ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGT TCCAT AA 
CCCGGCTGGC TCAGCTGGAG TGCTGGGAGA TGAGGGCCTC CTGQATCCTG CTCCCTTCTG 
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG 
GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 
AAATAAAGAT ATTAAAAAAG GCAAATACCA 



Seq ID NO: 14 Protein sequence: 
Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

MMCSSLEQAL AVLVTTFHKY SOQBQOKFKL SKOEMKELLH KELPSFVGBK VDEBGLKKLM 
G8LDENSDQQ VDFQBYAVFL ALITVMCNDF FQGCPDRP 



Seq ID NO: 15 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 62-358 



1 11 

i I 
GGAGG6TGTG CCGCTGAGTC 
CATGATGTGC AOTTCTCTGG 
CTCCTGCCAA GA6GGGGACA 
CAAGGAGCTG CCCAGCTTTG 
GGGCAGCCTG GATGAGAACA 
GGCACTCATC ACTGTCATGT 
CAGAACTCTT GACTTCCTGC 
TTTGTATTCA ATAAACTTTT 
ATAACCCGGC TGGCTCAGCT 
TCTGGGCTCT GACTCTCCTG 
TTIGGAATTT CAAACACCAG 



21 31 

i 1 
ACTGCCTGGG CATCTGGGCC 
AGCAGGOGCT GGCraVGCTG 
AOTTCAAOCT 6A0TAAGGGG 
TGGGGGAGAA AGTG6ATGAG 
GTGACCAGCA GGTGGACTTC 
GCAATGACTT CTTCCAGGGC 
CATGGATCTC TTOGGCCCAG 
TTTGTCTGTT 6ATAATATTT 
GGAGTGCTGG GAGATGAGGG 
GAAATCTCTC CAAGGCCAGA 
CAAAAAATTG GAAATOGAGA 



41 51 

I I 
TGGAACCTGG GCCACAGATC 
GTCACTACCT TCCACAA0TA 
GAAATGAAGQ AACTTCT6CA 
GAGGG6CTGA AGAAGCTGAT 
CAGGAGTATG CTGTTTTCCT 
TGCCCAGACC GACCCTGAAG 
GACTGTTGAT GCCTTTGAGT 
TAATT6CTCA GTGATGTTCC 
CCTCCTGGAT CCTGCTOCCT 
OCTATGCTTT AGGTCTCAAT 
TAGGTTGCTG ACTTTTATTT
TGTCAAATAA A6MATTAAA. ACCH

Seq ID NO: 16 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

HNCSSLEQAL AVLVTTPHKY SCQEGDKFKL SKGEMKELIiH KEIiPSFVGEX VDEEGLKKLM 60 

GSLDENSDQQ VDFQEYAVFL ALITVMCNDP FQQCPDRP 



Seq ID NO: 17 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 939-2372 

1 11 21 31 41 51 

I I I I I i 

AAGACGGATT CTCAGACAAG GCTTGCAAAT GCCCXX3CAGC CATCATTTAA CTGCACCCGC 60 

AGAATAGTTA GGGTTTGTCA CCGGACGCTC COQGATOGCC TAATTTGTCC CTASIGAGAC 120 

CCCGAGGCTC TGCCCGCGCC TGGCTTCTTC GTAGCTGGAT GCATATCOTG CTCCGGGCAG IBO 

CGCGGGCGCA GGGCACGCGT TCGCGCACAC CCTAGCAC3VC AT6AACACGC GCAAGAGCTG 240 

AACCAAGCAC GGTTTCCATT TCAAAAAGGG AGACAGCCTC TACCGC3GATT GTAGAAGAGA 300 

CTGTGGT6T6 AATTAG6GAC 06GGAGG06T CGAA066AGG AACGGTTCAT CTTA6AGACT 360 

AATTTTCtGG AGTTTCT6CC GCTGCTCT6C GTCAGCCCrC AC3QTCACTTC 6CCAQCAGTA 420 

GCAGA6G0G6 CGGOGGGGGC TCCC66AATT GGGTTGQAGC AOGAGCCTCS CT06CT6CTT 4B0 

CGCTCGCGCT CTACGCGCTC AGTCCCOGGC GGTAGCAGGA GCCTGGACCC AGGCGCCGCC 540 

GGCGGGOSTG AGGCGCCOOA GCCCGGCCTC GAGGT6CATA CCGGACCCCC ATTCGCATCT 600 

AACAAGGAAT COXKXSCCCCA GAGA6TCCCG GGAGGGCCGC CGGTCGGT6C CGGG06COCC 660 

G6QCCATGCA GOGAOGGGCG GCX3CGGA6CT CCXSAGCAGOG GTAGC6CCCC CCTGTAAA6C 720 

GGTTCGCTAT GCCGGGGCCA CTGTGAACCC TGCCGCCTGC CGGAACACTC TTOGCTCCGG 780 

ACCAGCTCAG CCTCTGATAA GCTGGACTCG GCACGCCOGC AACAAGCACC GAGGAGTTAA 840 

GAGAGCOGCA AGCGCAGGGA AGGCCTCCCC GCACGGGTGG GGGAAAGCX3G CCGGTeCAGC 900 

GGGGGGACAG GCACTCGGGC TG6CACTGGC TGCTAG6GAT GTCGTCCTTGG ATAAGGTGGC 960 

AT6GACCG6C CATGG06CGG CTCTGGGGCr TCTGCrGGCT 6GTTGTGG0C TTCTGGAQGO 1020 

CG6CTTTCGC CTGTCCCACQ TCCT6CAAAT GGAOTGCCTC TCQGATCTGG TGCSlQCaACC 1080 

CTTCTCCTGG CATCX5TGGCA TTTCCGAGAT TGGAGCCTAA CAGTGTAGAT CCTGAGAACA 1140 

TCACCGAAAT TTTCATCGCA AACCAGAAAA QGTTAGAAAT CATCAAOGAA GATGATGTTG 1200 

AAGCTTATGT GGGACTGAGA AATCTQACAA TTGTGSATTC TGGATTAAAA TTTGTGGCTC 1260 

ATAAAGCATT TCTGAAAAAC AGCAAGCIQC AKSGAGATCAA TTTTAGCGQA AACAAACTGA 1320 

CGAGTTPGTC TAGGAAACAT TTCOGTCRCC TTGACTTGTC T6AACT6ATC CTGGTGGGCA 1380 

ATCCATTTAC ATGCTCCPGT GACATTATOT GGATCAAGAC TCTCCAAQAG GCTAAATCCA 1440 

GTCCAGACAC TCAQGATTTG TACTGCCTGA ATGAAAGCAG CAAGAATATT CCCCTGGCAA 1500 

ACCTGCAGAT ACCCAATTGT GGTTTGCCAT CTGCAAATCT GGCCGCACCT AACCTCACTG 1560 

TGGAGGAAGG AAAGTCTATC ACATTATCCT GTAGTGTGGC AGGTGATCCG GTTCCTAATA 1620 

TSTATTGGGA TOTTGGTAAC CTGOTTTCCA AACATATGAA TGAAACAAGC CACACACAGG 1680 

GCTCCTTAAG GATAACTAAC ATTTCATCCG ATGACAGTGG GAAGCAGATC TCTTGTGTGG 1740 

CGGAAAATCT T6TAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800 

CTATCACATT TCTCX3AATCT CCAACCTCAG ACCACCACTG 6TGCATTCCA TTGACTGTGA 1860 

AAGGCAACCC CAAACCAGCG CTTCAGTGGT TCTATAAOGG GGCAATATTG AATGAGTCCA 1920 

AATACATCTG TACTAAAATA CATGTTACCA ATC3VCACGGA GTACCACGGC TGCCTCCAGC 1980 

TGGATAATCC CACTCACATG AACAATGGGG ACTACACTCT AATAGCCAAG AATGAGTATG 2040 

6GAAGGATGA GAAACAGATT TCTGCTCACT TCATOGaCrG 6CCTGGAATT GA06ATG0TG 2100 

CAAACCCAAA TTATCCTGAr GTAATTTATG AAGATTATG6 AACTGCA606 AATGACATC5G 2160 

GGGACACCAC GAACAGAAGT AATGAAATCC CTTCCACAGA CGTCACTGAT AAAACOSGTC 2220 

GGGAACATCT CTOGGTCTAT GCTGTGGTGG TGATTGCGTC TGTGGTGGGA TTTTGCCTTT 2280 

TGGTAATGCT GTTTCTGCTT AA6TTGGCAA GACACTCCAA GTTTGGCATO AAAGGTTTTG 2340 

TTTTGTTTCA TAAQATCOCA CT6GAXGGGT AGCTGAAATA AAG6AAAAGA GA6AGAAAGQ 2400 

GGCTGTGGTG C Tl XS T lW n ' GATGCTGCCA TGTAAGCTGG ACTCCTGGGA CT GCTG TTGG 2460 

CTTATCCCGG GAAGTGCT6C TTATCTGGGG TTTTCTGGTA 6ATGTGGGCG GTGTTTGGAO 2520 

GCTGTACTAT ATGAAGCCTG CATATACTGT QAGCTGTGAT TGGGGAACAC CAATGCAGAG 2580 

GXAACTCTCA GGCRGCTAAG CAGCACCTCA AGAAAACATG TTAAATTAAT GCTTCTCTTC 2640 

TTACAGTAGT TCAAATACAA AACTGAAATG AAATOCCATT 6GATTGTACT TCTCTTCTGA 2700 

AAAGTGTGCT TTTTGACCCT ACTGGACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760 

TTGACCTGCA AAGTTAAAAA AAAATTAAAG TTGAGAACAG GTATAAGT6C ACACTGAATA 2820 

GTCTAATCTA CATGTAACAC ATATTTTAQT GT6ATTTTCT ATACTCTAAT CAGCACTGAA 2880 

TTCAGAGGOT TTGACTTTTT CATCTATAAC ACAGTGACTA AAAGAGTTAA GGGTATATAT 2940 

AOCATCSUrrr TGGGACTTGO TAGTATTATT AAAAGGTTAT TTCCTTCACT 6TC3iATAAAA 3000 

GTCX3\AATGT TTAGCTTAGG TCTGAQAGTC AAACAATGTT AAGGATTGTC TTAAA6TTCC 3060 

TTAGCCAGCA AAACAAAACA AAACAAAACA AACAAATGAA AAACGTTTAA AAA6AAGAAG 3120 

AAGAAAAAAA AC3^AGAACAA GCAGCAACAG CTGTTTTGTT GGGGCTATAG ATTT7A6TTA 3180 

GGCATAGTCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATATGGTQA AATTATAACC 3240 

TTGCCCTTTT TTATTTGCCC TCTGOGATCC ACXTTGCTTTT TAQAAGTCTQ GOQAGTGAQA 3300 

AGGCXACAGT ATCTCATGCT GTTTGCATTA CAGAACTGCA 6CTTTTCXAC TCTGAAAA6G 3360 

CCTGGGAGCA GAATGGCTGG CCTGCTGTGA GCAC3GA0AGG AGATTCTAA6 AAGGATAGTC 3420 

CCCX:CTACAA CATACTGTCA TACTGCTGGG TTTTCaTGGG TAGGAAAGCT TGTCCTGACC 3480 

CX3VGCAGCAA AGAGGTGGCA QGTC3GCPAAT GAATATATGC TTTATAATGT CCTTCTTCAT 3540 

TGCTGAGA6G GCAOCCTTAO AQCTGTGGAT TTCTGCATCC CCCCTGAGTC TGACCCATCG 3600 

ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT CAGTCTGTGT 3660 

CAGGCAGTAT GCTTGTCCTG AAGA6AGGTT TGGCTATCCC CACCCCACCC CACCCCACXrC 3720 

TGTTCCTTTT TTATCAGGAG GACTTCAGAG CCAGGCCTGC AGCATTTTGT TTGAAAACAC 3780 

AATCAGCTCT GACAGTTAGA CATGCACACA GACGCCATAG CTGGATTGGA A ACAT TGATG 3840 

TTTTAAAAAT TTATTTTTTT TG6AAATAGT T6CACAAATG CTGCAATTTA GCTTTAAGGT 3900 

TCTATAGATT TTTAACTAGT CCAACACAGT CAQAAACATT GTTTTQAATC CTCTGTAAAC 3960 

CAAGGCATTA ATCTTAATAA ACXaCGATCC ATTTAGGTAC CACTT6ATAT AAAAAGGATA 4020 

TCCATAATGA ATATTTTATA CTGCATCCTT TACATTAGOC ACTAAATAGG TTATTGCTTG 4080 

ATGAAGACCT TTCACA6AAT CCTATGGATT 6CA0CATTTC ACTTGGCXAC TTCATACCCA 4140
TGCX^TTAAAG AGGGGCAGTT TCTCAAAAGC AGAAACATGC CXSCCAGTTCT CAAGTTTTCC 4200

TCCTAACTCC ATTTGAATGT AAGGGCAGCT GGCOCCCAUT GTGGOQAGGT C08AACATTT 4260 

TCTGAATTCC CATTTTCTTG TTCGCGGCTA AATGACAGTT TCTGTCATTA CTTAGA.TTCC 4320 

GATCTTTCCC AAAGGTGTTO ATTTAC3VAAG AGGCCA6CTA ATAGCAOAAA TCATGACCCT 4380 

GAAAGAGAGA TGAAATTCAA GCTGTGAGCC AGGCAQGASC TCAGTAT6QC AAAGGTTCTT 4440 

6AGAATCAGC CA.TTTG6TAC AAAAAAGATT TTTAAAGCTT TTAT6TTATA CCATGOAGCC 4500 

ATAGAAAGOC TATGGATTGT TTAAOAACTA TTTTAAAGTG TTCO^OACCC AAAAAGGAAA 4560 

AA7AAAAAAA AAGGAATATT TGTACOCAAC AGCTASAAGQ ATTGCAAOGT AGATTTTZGT 4620 

TTTAAAATG6 AGA6AAGT06 ACAGATAAGG CCATTTAATA TATCAAAGAT CA6TTGACAT 4680 
CTOCTAGGGA ATGATGAAAA CAQGAGGCTA T 

Seq ID NO: 18 Protein sequence:
Protein Accession #: CAA53571

1 11 21 31 41 51

| | | | | |

NSSWIRNHGP AMARLN6FCH LWGFWRAAF ACPTSCKCSA SRIWCSDPSP GZVAFPRLEP ' 60 

NSVDPENITB ZPIAMQKRLB IIBBDDVEAY VGLBNLTIVD SGLKFVAHKA FLKNSNLQHZ . 120 

NFTRNKLTSL SRKHFRHLDL SBtilLVBNPP TCSCBIMMIK TLQEAKSSPD TQDLYCLNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEB GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT PLESPTSDHH 300 

HCIPFTVKGN PKPAtiQHFYN OAlMIBSKyi CTKIHVTNHT BYHGCLQLDK PTHMNNGDYT 360 

LIAKNEYOKD EKQISAHFMO NF6IDDGANP NYFDVIYEDY GTAAMDIGDT TNRSNEIPST 420 
DVTDKTGRKH LSVYAVWIA SWGFCLLVM LFLI*RLARHS KF6MXGFVLF HKIPLDG 



Seq ID NO: 19 DNA sequence

Nucleic Acid Accession #: NM_000226

Coding sequence: 82-3600

1 11 21 31 41 51

| | | | | |

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60 

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGG GACC TG 180 

CTTGTTGGGA GGACCCGGTT TCTOOSAGCT TCATCTAGCT GTaGACTGAC CAAfiOCTGAO 240 

ACX:TACTGCA CCXSIGTATGG CGAGTGGCSVG ATGAAATGCT GCAA6TGTGA CTCCA6GCAG 300 

CCTCACAACT ACTACAGTCA COGAGTAGAG AATGTGGCTT CATCCTCCGG CCCCATGOGC 360 

TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT GGACAGGAGA 420 

TTCCAGCTTC AAGAAGTCAT GAT6GAGTTC CAGGGGCCCA TGCCOSCGGG CATGCTGATT 480 

GAS06CTCCT CAGACTTOSG TAA8ACCTG0 CQAOTOTACC AGIACCTG6C TGCCGACIGC 540 

ACCTCCACCT TC3CCTCGQGT COSCCAGGGT OQGCCTCAGA GCTGGCAGGA TGTT06GTGC 600 

CAGTCCCTGC CTCAGAGGCC TAATGCAC6C CTAAATG6GG GGAABGTCCA ACTTAACCTT 660 

ATG6ATTTA6 TGTCTQGGAT TOCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720 

ATCACAAACT TGAGAGTCAA TTTCACCAG6 CTGGCCCCTG TGCCCCAAAS GGGCTACCAC 780 

CCTCCCAGG6 CCTACTATQC TOTOTCCCAO CTGCSTCTGC AGGGGA6CI0 CTTCTGTCAC 840 

Q6CCAT0CTG ATOSCTGCXSC ACCCAAGCCT GGGGCCTCTG CAGGCCXX:TC GACOSCTGTG 900 

CAGQTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TQAGOGCTGT 960 

GCACCCTTCT ACAACAACCG GCCCTGGAGA COGGCGGAGG GCCAGGACX3C CC ATGAAT GC 1020 

CAAAG6TG03 ACT6CAATG6 GCACTCAGA6 ACATGTCACT TTGACCCC6C TGTGTTTGCC 1080 

GCCA6CCAGG GG6CATATGG AGGTGrGTGT GACAATT6CC 6GGA0CACAC OSAAGGCAAG 1140 

AACT6TGA6C GGTGTCAGCT GCACTATTTC C66AACG6GC 60C0GGGAGC TTCCATTCAG 1200 

GAGACCTGCA TCTCCTGCGA GTGTGATCOS GATGGGGCAG TGCCAGGGGC TCCCTGTGAC 1260 

CCAGTQACOG GOCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320 

AAGCCGGGCT TCACTGQACT CACCTACX3CC AACC0GCA6G GCTQCCACOG CTGTQACTGC 1380 

AACATCCIOG GQTC0CX3GAG G6ACATGGCG TGTGA06AGG A6AGT6GG0S CIGCCTTTGT 1440 

CIGCCCAACX3 TGGTGGGTCC CAAATGTGAC CAGTGT6CTC CCIACCACTG GAA6CTGGCC 1500 

AOTGOCCAGO GCTOTGAACC GTGTGCCTGC GACCOGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTO GTGQCCTQAT GTGCAGOGCT 1620 

GCAGCCATCC GOCAGTGTCC AGACCGGACC TATGQAGACG TGGCC3iCAGG ATGCOSAGCC 1680 

TGTGACTGTO ATTTCCGGGG AACAGAGGGC CCQGGCTGGS ACAAG6CATC AGGCCGCTGC 1740 

CTCIGCOGCC CTGGCTTGAC CGCSXSCCCGC TGTGACCAGT GOCaGOGAGG CTACTGCAAT 1800 

CGCTACCC60 TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCCGG 1860 

GAGCAG6CCC TGCGCTTTGQ TAGACTCCX3C AATGCCACC6 CCAGCCT6TG GTC3V6GGCCT 1920 

QGGCTGGAGG ACCGTQGCCT GGCCTCCCX3G ATOCTAGATG CAAAGAGTAA GATTGAGCAQ 1980 

ATOOQAOCAO TTCTCAOCAG CCCOGCAGTC ACAGA6CAGG AGGTGGCTCA GGTGGCCAGT 2040 

OCCATCCTCT CCCTCAGGCQ AACTCTCCA6 GGCXTTOCAGC TGGATCTGCC CCTGGAGQAG 2100 

GAGAOGTTGT CCCTTCCGAG AGACCTGGAG AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGT6CTQA TCCTTCAGGA 2220 

GCCTTCC3GGA TGCTGAGCAC AGCCTACGAG CAGTCAGCCC AQ6CTGCTCA GCAGGTCTCC 2280 

GACAGCTOSC GiXTrTTGGA CCAGCTCAGO GACAGCCGGA GAGA6GCAGA GAGGCTGGTG 2340 

OGGCAG6CX3G GAGGAGGA6G AGGCACOGGC AGCXXXZAAGC TTGTGGCCCT GAGGCTGGAG 2400 

ATGTCTTOGT TGCCTGACXTT GAC31CCCACC TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 2460. 

ATGGCTTQCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TQTCCTTCCC AGQGCCGGTG GGGCCTTCTT GATGGOGGGG 2S80 

CAGGTGGCTG AQCAGCTGCG GGGCTTCAAT GCCCAGCTCC AGCX3GACCAG GCAQATGATT 2640 

AGQGCAGCOG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCX3CTT GGAGACCCAG 2700 

GTGA6CGGCA GCGGCTCCCA GATGGAGGAA 6ATCTCA6AC GCACACXSGCT CCTAATCCAG 2760 

CA6GTCCGGG ACTTCCTAAC AGACCCCGAC ACTGATGCAG CCACTATCCA GOAGQTCAGC 2820 

GRGGCX:GTGC TGGCCCTGTG GCTGCCX^VCA GACTCABCTA CTGTTCTGCA QAA6AT8AAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AAOGTGGACT TGGTGCTGTC CCAGACCAAG 2940 

CAGGACATTG CGCGTGCCCG C06GT7GCAG GCTGAGGCTG AGGAAGCCAG GAGCC6AGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAQATGTG OTTGGGAACC TGCG6CA8GG GACAGTGGCS^ 3060 

CTGCAG6AAG CTCAGQACAC CATGCAAGGC ACCAGCCQCI GCCCTG6GCT TATCCA6GAC 3120 

AGGGTTGCTG AGGTTCAGCA GGTACTGGGG CCAGCAGAAA AGCtGGTGAC AAGC%TGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGQCAGAG3 CAGTCCAGGC CCAGCAGCTT GCGGAAGQTG CCAGOGAQCA GGCATTGAGT 3300 

GCCCAAGAG6 GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA CCX3GTIGGGT 3360 
CAQAGTTCXa TGCTGGGTQA GCAGGGTGCC CGGATCCAGA GTGTQAAGAC AGAGGCAQAG 3420

QAOCTGTTTG GG6A6ACCAT GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480

CTGOGGGGCA GCCAGGOCAT CATGCTGCGC TCGGC36GACC TGACAGOACT GGAGAAGCGT 3540

GTGGAGCAGA TCCQTCSACCIV CATCAATGG6 C3GCGTGCrCT ACTATGCCAC CTGCAAGTGA 3600

TGCTACAGCr TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCTTTTG GTTGGGGG^ 3660

GATTGGGTTG GAATGCTTTC CATCTCCAGG A6ACTTTCAT GCAGCCTAAA GTACAGCCTG 3720

GACCACCCCT GGTGTGTAGC TAGTAAGATT ACCCTQAGCT GCAGCTGA6C CTGAGCOAT 3780

GGGACAGTTA CACTTOACS^G ACAAAGATGG TGGAGATTGG CATGCCATTG AAACTAAQftfi 3840

CTCTCAAGTC AAGOAAOCTO GGCXGGOCftG TATCCCCCX5C CTTTAGTTCT OaCTGGGGA 3900

GGAATCCTGO ACCAAGCACA AAAACTTAAC AAAAOTOATO TAAAAATOAA AMCCAAATA 3960
AAAATCTTTG G



Seq ID NO: 20 Protein sequence: 
Protein Accession #: NP_000219



1 11 21 31 41 51

I I I I I I

MRPFFUiCFA LPGIiLHAQQA CSRGACYPPV GDLLVGRTRF LRASSTGGLT KPETYCTQY6 60
EHQNKCCKCD SRQPBinnrsH RVEtlVASSSG PMRWHQSQND VNPVSIK2LDL DRRPQIiQEVM 120
MEFQGFMPA6 MLIER8SDFG KTSIRVYQYLA ADCTSTPPRV RQGRPQSWQD VRCQSLPQRP 180
NARIiNGGKVQ LHLMDLVSGI PATQSQKIQE VGEITMIiRVH FTRLAPVPQR GYHPPSAYYA 240
VSQLRLQGSC PCHGHADRCA PKPGASAGPS TAVQVHDVCV OQBHTAGPNC BRCAPFYHNR 300
PHRPABGQDA HBCQRCDCHG HSSTCHPDPA VFAASQGAYG 6VCDNCRDBT [...] 360
HyFRNRRPQA SIQBTCX8CE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGFTGL 420
TYANPQGCHR CDCNILGSRR DMPCDEBSGR CLOjPHVVGP KCDQCAPYHW KLASGQGCEP 480
CACDPHNSPQ PTVQPVHHAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540
TEGPGCDKAS GRCLCRPGIiT 6PRCDQCQRG YCNRYPVCVA CHPCPQTYnA DLREQAIJIFG 600
RLBNATASLW SGPGLEDRGIi ASRZLDAKSK ZEQIRAVLSS FAVTBQSVAQ VASAILSLRR 660
TLQGLQIiDLP XiEEBTIiSLFR DLESXjDRSFN GtJ^TMYQRKR EQFEKISSAD PSGAFRMLST 720
AYEQSAQAAQ QVSDSSRLUD QIiRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDli 780
TPTPNKLCGN SRQMACTPIS C9GELCPQW GTACX3SRCRG VLPRAGGAFL MAGQVAEQLR 840
GFNAQLQRTR QMIRAAEBSA SQIQSSAQRIi ETQVSASRSQ MEBDVRRTRL LIQQVRDFLT 900
DPDTDAATIQ EVSEAVLALW IiPTDSATVLQ KMNEIQAIAA RliPNVDLVLS QTKQDIARAR 960
RLQAEAEEAR SRAHAVEGQV EDWGNLRQG TVAIiQEAQDT MQGTSRSLRL IQDRVAEVQQ 1020
VLRPABKLVT SMTKQLGDFW TRMEEliREQA RQQGABAVQA QQLAEGASEQ ALSAQEGPER 1080
IKQKYAELKD RLGQSSMLGE QGARIQ8VKT EAEELFGETM [...] IiELLRGSQAI 1140
lOiRSADLTGb EKRVEQZRDH INGRVLYYAT CK



Seq ID NO: 21 DNA sequence 
Nucleic Acid Accession #: NM_003722 
Coding sequence: 145-1491 



1 11 21 31 41 51

I I I I I I

TCGTTGATAT CAAAGACAGT TGAAGGAAAT GAATTTTGAA ACTTGACGGT GTGCCACCCT 60
ACAGTACTGC CCTGACCCTT ACATCCAGCG TTTOGTAGAA ACCCAGCTCA TTTCTCTTGG 120
AAA6AAAGTT ATTAGOQATC C3«XyiTGTCC CAGAGCACAC AGACAAATGA ATTCCTCA6T 180
CCAQAGGTTT TGCAQCATAT CTQGGATITT CTGGAACAGC CTATATGTTC AGTTC3W3CCC 240
ATTGACTTQA ACTTTGTGGA TGAACCATCA GAAGATGGTG CGACAAACAA GATTGAGATT 300
AGCATGGACT GTATCCGCAT GCAGGACTCG GA0CTGA0T6 ACCCCAT6TG GCCACAOTAC 360
AGGAACCTG6 GGCTCCTGAA CAGCATG6AC CAGCAGATTC AGAACXjGCTC CTCGTCCACC 420
AGTOOCTATA ACACAGACCA aSOGCAGAAC AG0GTCA06G CX3CCCTCX3CC CTAOGCaCAG 480
CCCAGCTCCA CCTTCGATGC TCTCrCTCCA TCACCCGCCA TCCCCrCCAA CACOGACTAC 540
CXAGGCCCGC ACAGTTTCGA OSTGTCCTTC CAGCAGT06A GCACGGCCAA GTCGGCCACC 600
TGGACGTATT CCACTGAACT GAAGftAACrC TACTGOCAAA TTGCftAAGAC ATGCCTCMC 660
CAGATCAAG6 TGATGACCCC ACCTOCTCAO GGAOCTGTTA TCC6C36CCAT GCCX'GTCX'AC 720
AAAAAAGCTG AGCACGTCAC GGAGGTGGTG AAGCGGTQCC CXaACCATGA GCTGAGCOGT 780
GAATTCAACG AGGGACAGAT TGCCCCTCCT AGTCATTTGA TTCGAGTAGA GGGGAACAGC 840
CATGCCCAGT ATGTAGAAGA TCCCATCACA GGAA6ACAGA GTGTGCTGGT ACCTTATGAG 900
CCACCCCAGG TTQQCACTQA ATTCAGC3ACA OTCTTQTACA ATTTCATGTG TAACAGCAGT 960
TGTGTTGQAG GGATGAACCG CCGTOCAATT TTAATCATT6 TTACTCTGGA AACCAGAGAT 1020
GGGCAAGTCC TGGGCCGACG CTGCTTTGAG GCCCGGATCT GTGCTTGCCC AGGAAGAGAC 1080
AGGAAGGCGG ATGAAGATAG CATCAGAAAG CAGCAAGTTT CGGACAGTAC AAAGAAOGGT 1140
GATGGTACOA AGCGCCCX3TT TCGTCAGAAC ACACATGGTA TCCAGATGAC ATCCATCAAG 1200
AAAGQAAGAT CGCCAGATGA T6AACTGTTA TACTTACCAG TGAGGGGCCG TGAGACTTAT 1260
GAAATGCTGT TGAAGATCAA A6AGTCCCTG GAACTCATGC AGTACCTTCC TCAQCACACA 1320
ATTGAAACGT ACAGGCAACA GCAACAGCAG CAGCACCAGC ACTTACTTCA GAAACATCTC 1380
CTTTCAGCCT GCTTCAGGAA TGAGCTTGT6 GAGCCCCGGA GAGAAACTCC AAAACAATCT 1440
GACGTCTTCT TTAGACATTC CAAGCCCCXA AACCX3ATCAG TGTACCCATA 6AGC0CTATC 1500
TCTATATTTT AAGTGTQTGT GTTGTATTTC CATGTGTATA TGT6AGTGTG TGTGTGTGTA 1560
TGTGTGTGCG TGTQTATCTA GCCCTCATAA ACAGGACTTG AAGACACTTT GGCFCAGAGA 1620
CCCAACT6CT CAAAGGCACA AAGCCACTAG TGAGAGAATC TTTTGAAGGG ACTCAAACXrr 1680
TTACAAQAAA GQATGTTTTC TGCAGATTTT GTATCCTTAG ACCGGCCATT QGTGGGTGAS 1740
GAACCACTGT GTTTGTCTGT GAGCTTTCTG TTOTTTCCTG GGAGGGAGOG GTCAGGTGGG 1800
OAAAGGGGCA TTAAGAT6TT TATTGGAACC CTTTTCTGTC TTCTTCTGTT GTTTTTCTAA 1860
AATTCACAGG GAAGCTTTTG AGCAGGTCTC AAACTTAAGA TQTCTTTTTA AGAAAAGGAG 1920
AAAAAAGTTG TTATTGTCTG TGCATAA6TA AGTTGTAGGT GACTGAGAGA CTCAGTCAGA 1980
CCCTTTTAAT GCTGOTCATG TAATAATATT 6CAAGTAGTA A6AAAC6AA6 GTGTCAAGTG 2040
TACTGCTGGG CAGCGAGGTG ATCATTACCA AAAGTAATCA ACTTTOTGGG TGGAGAGTTC 2100
TTTGTGA6AA CTTGCATTAT TTGTGTCCTC CCCTCATGTG TAGGTAGAAC ATTTCTTAAT 2160
GCTGTGTACC TGCCTCTGCC ACTGTATGTT GGCATCTGTT ATGCTAAAGT TTTTCTTGTA 2220
CATGAAACCC TGGAAGACCT ACTACAAAAA AACTGTTGTT TGGCCCCCAT AGCAGGTGAA 2280
CTCATTTTGT GCTTTTAATA OAAAOAGAAA TCCACCCCAG TAATATTGCC CTTACGTAGT 2340
TGTTTACCAT TATTCAAA6C TCAAAATAGA ATTFGAAGCC CTCTCACAAA ATCT6TGATT 2400
AATTTGCTTA ATTAGAGCTT CTATCCCTCA AGCCTACCTA CCATAAAACC AGCXATATTA 2460
CTGATACTGT TCAGTGCATT TAGCCAGGAG ACTTACGTTT TGAGTAAGTG AGATCCAA6C 2520
AGACGTGTTA AAATCAGCAC TCCTGGACTG GAAATTAAAG ATTGAAAGGG TAGACTACTT 2580
TTCTTTTTTT TACTCAAAAG TTTAGAGAAT CTCTGTTTCT TTCCATrTTA AAAACATATT 2640
TTAAGATAAT ASCftTAAAGA CTTIAAAAAT GTTCCTOCCC TCCATCTTCC CMACCCAGT 2700
CACCAGCACT GTATTTTCT6 TCACXAAGAC AATGATTTCT T6ITATTGA6 GCTGTTGCTT 2760
TTGTGGATGT GTGATTTTAA TTTTCAATAA ACTTTTGCAT CTTGGTTTAA AAGAAA 

Seq ID NO: 22 Protein sequence:
Protein Accession #: NP_003713

1 11 21 31 41 51 

I I I I I I

MSQSTQTNEF ItSPEVFQHIW DFLEQPICSV QPZOUIFVDE PSEDGAimi EISMDCIRMQ 60 

DSDLSDFMSfP OniHLOLIiNS MDQQIQNGS8 STSPYNTDHA QHSVTAPSPY AQPSSTFDAIi 120 

SPSPAIPSHT DYPGPBSFDV SFQQSSTAKS ATWTYSTEIiK KLYOQIAKTC PIQIICVMTPF 180

PC6AVIRAMP VYRKAEHVTE VVKRCP13HEL SREFKEGQIA PPSHLIRVE6 NSHAQYVEDP 240 

ITGRQSVLVP YEPPQVGTEF TTVLYNFMOJ SSCVGGMNRR PILIIVTLET RSGQVLGRRC 300 

FEARICACPG RDRRADEDSI R2CQQVSDSTK NGDGTKRPFR QNTBGIQMTS IKKRRSPDDE 360 

UiYXtPVRGRE TYBOiLKIKE SLELMQYLPQ HTZBTYSQQQ QQQBQBLK2K HLLSACFRKE 420 
LVEPRRETPK QSDVFFRHSK PPHRSVYP 



Seq ID NO: 23 DNA sequence 

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84-3083

1 11 21 31 41 51 

I I I I I I

TTTTCTTAGA CATTAACTGC AGA06GCTGG CAGGATAGAA GCAGCGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 120 

CCATCTTOGT GGTGGTCATA TTG6TTCATG GA8AATT606 AATAOAGACT AAAGGTCAAT 180 

ATGATGAAia AGl^TGACT ATGCAACAftG CTAAAA81UVG GGAAAAM3GT GAATGGGTQA 240 

AATT1GCCAA ACCCI6CAGA GAAG6AGAAS ATAACTCAAA AASAAACCCA ATTGOaAGA 300 

TTACTTCAGA TTACCAAGCA ACXXAGAAAA TCACCTAC3CG AATCTCTGGA GTGGQAATOG 360 

ATCAGCCGCX: TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGOAGATATT AACATAACAG 420 

CTATAGTC6A COGGGAGQAA ACTGCAA6CT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480

AAGGHCIAGA TGTAQAGAAA CCACTTATAC TAAOQGTTAA AATTITGGAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TG6GTGAAAT TGAAGAAAAT AST6CCTCAA 600 

ACTCACTGGT GATGATACTA AATGCXACAQ ATQCAGATOA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGOAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720 

GAAACACTGG GGAAGTCGGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCGTCTG6T TGTGAOTGGT OCAGACAAAO ATGGAGAAGG ACTATCAACT CAATGT6AAT 840 

QTAATATTAA AGTGAAAGAT GTCAACGATA ACTTCCCAAT OTTTAGAGAC TCTCAGTATT 900 

CAGCAOGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAAnGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATQAACAA CTACAAAGOG TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CT6AATTTCA CCAATGAGTT ATCTCTC6AT ACCGAGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GQAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAG6CAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 1320 

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGAOGT AA06AT6GTG 1380 

GATACCTAAT GATTQATTCA AAAACTGCTO AAATCAAATT TGTCAAAAAT ATQAACOGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CA6CTGAGGT TCTQQCXATA GATGAATACA 1500 

0GG6TAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC OGATTTCAAT GACAATTGTC 1560 

CAACAGCTOT CCT06AAAAA GATGCAGTTT 6CAGTTCTTC ACCTTCCGTG GTTGTCTCOG 1620 

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 1680 

TAAAOTTGCC T6CCX3TAT6G AGTATCAOVA COCTCRATGC TAOCTOGOCC CTCCTCAGAG 1740 

CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATOGGTG T6AGATGCCA CXSCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860 

GCATCTGTGG AACTTCTTAC CCAACCACAA GCXXnXKJQAC CAG6TATGGC AGGCXGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCXXX:T TCTGCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GAT6GCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAQ 2100 

GAGCCCATCC TGAAGACAAQ GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160 

GAOCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCAGA GGCACAGOQG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACX3V CTAAGCTTGG hOCAGCCRCT GAATCTGGAO 2280 

GTGCTGOVGG CTTTGCAACA GGGACAGTGT CAGGAGCTGC TTCaWSGATTC GGAGCAGCCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAG6AACCAA TAA6GACTAC GCTGATGGGG 06ATAAGCAT GAATTTTCTG GACTGCTACT 2460 

TTTCTCAGAA AGCATTTGOC TGTGCXK3AGG AAGACGAT6G CCAGGAAGCA AATGACTSCT 2520 

TGTTGATCTA TQATAATGAA GGCGCAOATG CCACTGGTTC TCCTGTQG6C TCCX3T00GTT 2580

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GQACTCACTT GGACXX::AAAT 2640 

rCAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGQTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAOAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCA.G GAAGTCAAG6 AGCTTCTGCT TT6TC06CCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCX: CTGAOCCTCT GCAGGATGGT AACTATTTAG 2880 

TAA06GAGAC TTACTCGGCT TCTQOTTCCC TCOTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAGTGTTC 3000 

CTGGCAACCT AQCTGGCCCA AGGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTT6CTC COSTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTG6 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGO GCTAATAATT 3180 

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCAOGATTAT AAATTAAATG TTTGOOTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTOTAOTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT attcgc 

Seq ID NO: 24 Protein sequence:
Protein Accession #: NP_001935.1

1 11 21 31 41 51 
I I I I I I

MMQLPPRTTG ALAIFVWIL VHGBLRIETK 6QYDBEEMTM QQAKRRQKRB WVKPAKPCRB 60 

GEDNSKRNPI AKITSDYQAT QKITVRISGV GIDQPPFGIP WDKNTGDIN ITAIVDRBET 120 

PSPLITCRAL NAQGLDVBKP LILTVKILDI NDNPPVFSQQ IPMGBIEENS ASNSLVMILN 180 

ATDADEPNHL NSKIAFKIVS QEPAGTPMFIj LSRNTGEVRT LTNSLDREQA SSYRLWSGA 240 

DKDGE6LSTQ CBCHIKVKDV MDNFPMFRDS QYSARIEENI LSSEIJiRFQV TDItDBEYTSN 300 

HIiAVYFPTSG NEGNWFEIQT DPRTMBGILR VVKAU>YEQL QSVKZiSIAVR NKABFHQSVI 360 

SRYRVQSTPV TIQVINVREG lAFRPASKTP TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 420 

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NRDSTPIVNK TITAEVLAID EYTGKTSTGT 480 

VYVRVPDFND NCPTAVLEKD AVCSSSPSW VSARTLNNRY TGPYTFALED QPVKLPAVWS 540 

ITTXiNATSAIi LRAQEQIPP6 VYHISLVLTD SCZBQIRCEMPR SLTLEVOQCD 2<fRGICSTSYP 600 

TTSPOTRYGR PBSORLOPAA lOIiLLLGLIiL UiLAPLLLLT CDCQAGSTOG VTGGFIFVPO 660 

6SB6TIRQW6 lEGAHPBDKE ITNICVPPVT AH6ADPHBSS EVCTNTYAHO TAVEGTSGHE 720 

WTTKLQAATE SGGAAGFATO TVSGAASGFG AATGVGICSS GQSGTMRTRH STG6TNICDYA 780 

DGAIS^4NFIlD SYFSQKAPAC AEEDDGQEAN DCLLIVDNEG ADAT6SPVGS VGCCSFIADD 840 

LDDSFIJ3SL6 PKFKKIiABIS LGVDGEGKEV QPPSKDSGYG IESC6HPIEV QQTGFVKCQT 900 

LSOSQGASAL 5ASG5VQPAV SIPDPLQBGI7 YLVTETYSAS GSLVQPSTAG FOPLLTQHVI 960 
VTERVICPIS SVPGHIiAGPT QLRGSHTMLC TEDPCSRIiI 

Seq ID NO: 25 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 56-1642

1 11 21 31 41 51 

I I I I I I

AGTATCCCAG GAG6AGCAAG TGGCACGTCT TCGGACCTA6 GCTGCCCCTO CCGTCATGTC 60 

GCAAGQGATC CTTTCTCCOC CA0GG06CTT GCTGTGGQAT GAOSATGTCG TAGTTTCTCC 120 

CATGTTTGAG TCCACAGCTG CAQATTTGGG GTCTGTGGTA CGCAAGAACC TGCTATCAGA 180 

CTGCTCTQTC GTCTCTACCT CCCTAGAGGA CAAGGAGCaG GTTCCATCTG AGGACAGTAT 240 

GGA6AAGQTG AAAGTATACT TGAGGGTTAG GCCCTTGTTA CCTTCAGAGT TGGAACX3ACA 300 

GGAAGATCAG GGTTGTGTCC GTATTGAGAA TGTGGAGACX: CTTGTTCTAC AAGCACCCAA 360

6GACTCTTTT GCCCtaAAGA GCAATGAAOG G6GAATTGGC CAAGCCaCAC ACAGGTTCAC 420 

CTTTTCCCAG ATCTTTGGGC CAGAAGTGGG ACAGGCATCC TTCTTCAACC TAACTGTGAA 480 

QGAGATGGTA AAGGATGTAC TCAAAGGGCA GAACTGGCTC ATCTATACAT ATGGA6TCAC 540 

TAACTCAGGG AAAACCCACA CGATTCAAQG TACCATCAAG QATGGAGG6A TTCTCCX:CCX3 600 

GTCCXrrGGOG CTGATCTTCA ATAGCCTCCA AQGCCAACTT CATCCAACAC CTGATCTGAA 660 

GCCCTTGCTC TCCAATOAGG TAATCTGGCT AGACAGCAAG CAGATCCGAC AGGAGGAAAT 720 

6AAGAAGCTO TCCCTGCTAA ATGGAGGCCT CCAAGAGGAG GAGCTGTCCA CTTCCTTGAA 780 

GAGGAQTGTC TACATCQAAA 6TCGGATAGG TACCAGCACC AGCTTCGACA GTGGCATTGC 840 

TGGGCTCTCT TCTATCAGTC AGTGTACCAG CAGTAGCX3UG CTGQATQAAA CAAGTCAT08 900 

ATGGGCACAG CC3W3ACACTG CCCCACTACC TGTCCCX3GCA AACATTCGCT TCTCCATCTG 960 

GATCTCATTC TTTGAQATCT ACAACGAACT GCTTTATGAC CTATTAGAAC CGCCTAGCCA 1020 

ACAG0GC3WIG AGGCAGACTT TGCGGCTATG CGAGQATCAA AATGGCAATC CCTATQTGAA 1080 

AGATCTCAAC TGGATTCATG TGCAAQATGC TGAGGAGGCC TGGAAGCTCC TAAAAGTGQG 1140 

TC6TAAGAAC CAGAGCTTTG OC3U3CACCCA CCTCAACCAO AACTCCAGCC OCAGTCACAO 1200 

CATCTTCTCA ATCAGGATCC TACACCTTCA GGGGGAAG6A GATATAOTCC CCAAGATCAG 1260 

OGAGCTGTCA CTCTGTGATC TGGCTGGCTC AGAGOQCTGC AAAGATCRGA AGAGTGGT6A 1320 

ACGGTTGAAG GAAGCAGGAA ACATTAACAC CTCTCTACAC ACCCTGGGCC GCTGTATTGC 1380 

TGCXCTTCGT CAAAACCAGC AGAACGGGTC AAAGCAGAAC CTGGTTCCCT TC06IGACAG 1440 

CAAGTTGACT GGAQTGTTCC AAGGTTTCTT CACAGGCGQA GGCCGTTCCT GCATGATT6T 1500 

CAAT6TGAAT CCCTGTGCAT CTACCTATQA TGAAACTCTT CATGTG6CCA AGTTCTCAGC 1560

CATTGCTA6C CAGGTSACTT GTOCATGCCC CACCTATGCA ACTGGGATTC CCATCCCTGC 1620

ACTCGTTCAT CAAGGAACAT AGTCTTCAGG TATCCCCCAG CTTAGAGAAA GGGGCTAAGG 1680

CAGACACAQ6 CCTTGATGAT GATATTGAAA ATGAA6CTGA CATCTCCATG TATGGCAAAQ 1740 

ASGAGCTCCT ACAAOTTGTG GAAGCCATGA AGACACTGCT TTEQAAGGAA GGACAOGAAA 1800 

A6CTACAGCT GQAGATGCAT CTCGQAGATG AAATTTGCAA TOAGATGOTA GAACAGATGC 1860

AACAGCGGGA ACAGTGGTGC AGTGAACATT TGGACACCCA AAAGGAACTA TTGQAGGAAA 1920 

TGTATGAAGA AAAACTAAAT ATCCTCAAGG AGTCACTGAC AAGTTTTTAC CAAGAAGA6A 1980 

TTCAGGAGOG GGATGAAAAG ATTGAA6A6C TAGAAGCTCT CTTGCA6GAA GCCAGACAAC 2040

AGTCAGIG6C CCATGAGCAA TGAGGGTCTG AATTGGCOCT AGGGCX33TCA CAAAGGTTGG 2100 

CAGCTTCT6C CTCCACCCAG CA6CTTCAGG A6GTTAAAGC TAAATTACAG CAGTGCftAAG 2160

C3W5AGCTAAA CTCTACCACT 6AAGAGTTGC ATAAGTATCA GAAAATGTTA GAACCACCAC 2220 

CCTCAGCCAA GCCCTTCACC ATTGATGTGG ACAAQAAGTT AGAAGAGGGC CAGAAGAATA 2280 

TAAGGCTGTT GG6GACA6AG CTTCAGAAAC TTGGTGAGTC TCTCCAATCA GCAGAGAGAG 2340 

CrrOTTGCCA CAGCACTGGG GCAGGAAAAC TTGGTCAA6C CTTGACCACT TGTGATGACA 2400 

TCTTAATGAA ACAGGACCAG ACTCIGGCT6 AAGTGCAQAA CAACATGGTG CTAGTGAAAC 2460 

TGGACCTTOO GAAGAAGGCA GCATGTATTG CTGAGCAGTA TCATACTGTG TTGAAACTCC 2520 

AAGGCCAGGT TTCTGCCAAA AAGCGCCTTG GTACCAACOV GQAAAATCAG CAACCAAACC 2580 

AACAACCACC AGGGAAGAAA CCATTCCTTC GAAATTTACT TCCCCGAACA CCAACCTGCC 2640 

AAAGCTCAAC AQACTGCAGC CCTTATGCCC GGATCCTACQ CTCACGGCGT TCCCCTTTAC 2700 

TCAAATCTGO GCCTTTTG6C AAAAAQTACT AAGGCTGTGG GGAAAGAGAA GAGCAGTCAT 2760 

6GCCCTGAGG TGGGTCAGCT ACTCTCCTGA AGAAATAGGT CTCTTTTATG CTTTACCATA 2820 

TATCAGGAAT TATATCCAGG ATGC3VATACT CAGACACTAG CTTTTTTCTC ACTTTTGTAT 2880 

TATAACCACC TAT6TAATCT CATGTTGTTG TTTTTTTTTA TTTACTTATA TGATTTCTAT 2940 

6CACACAAAA ACAGTTATAT TAAAGATATT ATTGTTCACA TTTTTTATT6 AATTCCAAAT 3000 
GTAGCAAAAT CATTAAAACA AATTATAAAA GG6ACAGAAA AA 

Seq ID NO: 26 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MSQGIIjSPPA GItLSDKTVW 8PMFE8TAAD LGSWRIOIIiZj SDCSWSTSL S3KQQVPSED 60 

SMEKViCVYLR VRFLLPSELE ROEDQGCVRI EETVETZiVLQA PXDSFALZCSN BRGIGQATHS 120 

FTPSQIFGPE VGQASFFNLT VKEMVKDVLK GQNWLIYTYG VTNSGKTHTI QGTIKDGGIL 180 

PRSIiALIFNS liQGQLHPTPD LKPLLSNEVI WLDSKQIRQE EMKKLSLLNG GLQEEELSTS 240 

LKRSVYIBSR IGT8TSFDSG lAGLSSISQC TSSSQLDBTS HRHAQPDTAP LPVPANIRFS 300 
IHISFFEIYN EXiLYDLLBFP SQQRKRQTLR LCEDQM6HPY VKDLNWIHVQ DAEEilWXIiIiK 360

VGRKNQSFAS TBUiQNSSRS BSIFSZRZLH LQGEQDZVPK ISEli£I«CDIA GSERCKDQKS 420 

6ERLXBAG1TI NTSLHTLGRC lAALRQMQW RSKQNLVFFR DSKLTRVFQO FFTQRGRSCM 480 
IVNVKPCAST YDETLSVAKF SAIASQVTCA CPTYATGIPI PAIiVHQGT 

Seq ID NO: 27 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 13-1424

1 11 21 31 41 51 

I I I I I I 

TAQUWSTTTA OUVTGAAGTT TCTTCTAATA CTGCTCCTGC AGGCCACTGC TTCTGGAGCT 60 

CTTCOCXTGA ACAGCTCTAC AAGCCTGGAA AAAAATAATG TGCTATTTGG TQAAAGATAC 120 

TTAGAAAAAT TTTATGGCCT TGAGATAAAC AAACTTCCAG TGACAAAAAT GUUUITATAGT 180 

GGAAACTTAA TGAAGGAAAA AATCCAA6AA ATGCAGCACT TCTTGGGTCT GAAAGTOACC 240 

GGGCAACTGG ACACATCTAC CCTGGAGATG ATGCACGCAC CTCGATQTGG A6TCCC00AT 300 

GTGCATCATT TCAGGGAAAT GCCAGGGGGG CCCGTATG6A GGAAACATTA TATCACCTAC 360 

AGAATCAATA ATTACACACC TGACATGAAC OQTGAGGATG TTGACTACXSC AATC06GAAA 420 

GCTTTCCAAG TATGGAGTAA TGTTACCCCC TTGAAATTCA GCAAGATTAA CACAG6CATG 480 

GCTGACATTT TGGTGGTTTT TGCCCX3TGQA GCTCATGGAG ACTTCCATGC TTTTQATOGC 540 

AAAGGTGGAA TCCTAGCCCA TGCTTTTGOA CCTGQATCTQ GCATTG6AGG GGATGCACAT 600 

rraSATGAGG AOGAATTCTQ QACTACACAT TCAGGAGGCA CAAACTTGTT CCTCACTGCT 660 

GTrCAOGAGA TTGGCCATTC CrTAGOTCTT GGCCATTCTA GTGATCCAAA G6CCGTAATG 720 

TTOCCCACCT ACAAATATGT TQACATCAAC ACaTTTOGCC TCTCTGCTGA TGACATAGQT 780 

GGCATTCAGT CCCTGTATGG AGACOCAAAA GAGAAOCAAC OCTTGOCAAA TOCTSACAAT 840 

TCAGAACCAG CTCTCTGTGA CCCCAATTTG AGTTTTGATG CTGTCACTAC GGTG6GAAAT 900 

AAQATCTTTT TCTTCAAAGA CAGGTTCTTC TGGCTGAAGG TTTCTGAGAG ACCAAAflACC 960 

AOTGTTAATT TAATTTCTTC CTTATGGCCA ACCTTGCCAT CTGGCATTGA AGCTGCTTAT 1020 

GAAATTGAAG CCAGAAATCA AGTTTTTCTT TTTAAAGATG ACAAATACTO GTTAATTAGC 1080 

AATTTAAGAC CAQA6CCAAA TTATCOCAAG AGCATACATT Crff T Ga - lT T TGCTAACTTT 1140 

6TGAAAAAAA TTGATGCAGC TGTTTTTAAC CCACXJTTTTT ATAGGACCTA CTTCTPTGTA 1200 

GATAACCAGT ATTGGAGGTA TGATGAAA6G AGACAGATGA TG6ACCCTGG TTATCCCAAA 1260 

CTGATTACCA AGAACTTCCA AGGAATCGGG CCTAAAATTG ATGC3W5TCrT CTACTCTAAA 1320 

AACAAATACT ACTATTTCTT CCAA6GATCT AACCAATTT6 AATATGACTT CCTACTCCAA 1380 

GGTATCACCA AAACACTGAA AAGCAATA6C iXJ C TIT U Gn ' GTTGAAAAIG GT6XAATTAA 1440 

TGGTTTTTGT TAGTTCACTT CAGCTTAATA AQTATTTATT GCATATTTGC TATGTCCTCA 1500 

GTGTACCACT ACTTAGAGAT ATGTATCATA AAAATAAAAT CTGTAAACCA TAGGTAATGA 1560 

TTATATAAAA TACATAATAT TTTTCy^ATTT TGAAAACTCT AATTGTCCAT TCTTGCTTGA 1620 

CTCTACTATT AAOTTTQAAA ATAGTTACCT TCAAAGCAA6 ATAATTCTAT TTGAAGCATG 1680 

CTCTSTAAST T6CTTCCTAA CATCCTTSGA CTGA6AAATT ATACTTACTT CIX36CATAAC 1740 
TAAAATTAAG TATATATATT TTGGCTCAAA TAAAATIG 

Seq ID NO: 28 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MKPIililLLLQ ATASGALPUI SSTSLEXNNV LFGERYLEKF YGIiEINKLPV TKMKYS^ailiM 60 

REKIQBMQHF LGLKVTGQIiD TSTLEMKSAP RGGVFDVHBF REKPGGPVNR KBYITYRINN 120 

YTFDMNREDV DZAISKAFQV WSNVTPIiKFS KIMTGMADIIt WFARGAHGD FHAFOGKOGI 180 

IiAHAEGPGSG IG6DAHFDED EFWTTHSGGT NLFLTAVHEZ GBSLGLGBSS OPKAVMPPTY 240 

KYVDINTFRL SADDZRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 300 

PKDRPPWLKV SERPKTSVNL ISSIiWPTIiPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 360 

BPNYPKSIHS P6FPNFVKKI DAAVFNPRFY RTyPPVDNQY WRYDKRRQMM DPOYPKLITK 420 
NFQ6IGPKZD AVFY8KN1CYY YFFQ6SNQFE YDFUX^RZTK TLKSNSWFGC 

Seq ID NO: 29 DNA sequence 

Nucleic Acid Accession #: NM_006115.1 

Coding sequence: 236..1765

1 11 21 31 41 51 

I I I I I I 

GCTTCAGGGT ACAGCTCCXX: CGCAGCCAGA AGCCGGGCCT GCAGCCCCTC AGCA006CTC 60 

COGGACaCCC CACCCGCTTC CCAGGOSTGA CCTGTCAACA GCAACTTCGC GGTGTX3GT6A 120 

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGCX3TGGCA ACAAGTGACT 180 

QAGACCTAGA AATCCAAGCG TTGGAGGTCC TGAGGCCAGC CTAAGTCGCT TCAAAATGGA 240 

ACGAAGGCGT TTGTGGGGTT CCATTCAGAG CGGATACATC AGCATGA6T6 TOTGGACAAG 300 

CCCAOGGAGA CTTCTrGGAGC TQ6CAGG6CA GAGCCTGCTG AAGGATOAGG C0CT6G0CAT 360 

TGCCGOCCTG GAGTTGCTGC CCAGG6AGCT CTTCCCGCCA CTCTTCATGG CA6CCTTTGA 420 

CGGGAGACAC AGCCAGACXX: TQAAGGCAAT GGTGCAGQCC TGGCCCTTCA CCTGCCTCCC 480

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGAG ACCTTCAAAG CrGTGCTTGA 540

TGGACTTGAT GTGC7CCTTG CCCAGGAGGT TOQCCCCAGG AGGTGGAAAC TTCAAGTGCT 600 

GGATTTACG6 AAGAACTCTC ATCAGGACTT CTGOACIGTA TGGTCIGGAA ACAGGGCGAG 660 

TCTGTACTCA TTTCCAGAGC CAGAAGCAGC TCAGOCCATG ACAAA6AAGC GAAAAGTAGA 720 

TGGTTTGAGC ACAGAGGCAO AQCAGCCCTT CATTCCAGTA GAGGTGCTOG TAGACCTGTT 780 

CCTCAAGGAA GGTQCCTGTG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGOGAAA 840 

GAAAAATGTA CTA06CCTGT 6CTGTAAGAA GCXGAAGATT TTTGCAATGC CCATGCAGGA 900 

TATCAA6ATQ ATCCTGAAAA TGGTQCAGCT GGACTCTATT GAAGATTTGO AAOTQACTTG 960 

TACCTGGAAG CTACCGACCT TGGOGAAATT TTCTCCTTAC CTGGGGCA6A TGATTAATCT 1020 

6CGTAGACTC CTCCTCTCCC ACATCCATGC ATCTTCCTAC ATTTCCCC3QG AQAAGGAAGA 1080 

GCAGTATATC GCCXaVGTTCA CCTCTCAGTT CCTCAGTCTG CAGTGOCTGC AG6CTCTCTA 1140 

T6TGGACTCT TTATTTTTCC TTAGAGGCCG CCTGGATCAG TTGCTCAGGC A06TGATGAA 1200

CCXXTTGGAA ACCCTCTCAA TAACTAACTG OGQGCTTTCG OAAGGGGATQ TGATOCATCT 1260 

GTCCCAGAGT CCCAGGGTCA OTCAGCTAAG TSTCCTGAGT CTAAOTG0G6 TCAT6CTGAC 1320 

OGATGTAAGr CCCGAGCCCC TCCAAQCTCT GCTGGAGAGA GCCTCTGCCA CCCTCCAGGA 1380 

CCTGGTCTTT GATGAQTGTG GGATCACXSGA TGATCAGCTC CTTGCCXnTCC TGCCTTCCCT 1440 

6AGCCACTGC TCCCAGCTTA CAACCTTAAG CTTCTACGGG AATTCX3VTCT CCATATCTGC 1500

CTTGCAGAGT CTCCTGCAGC ACCTCA.TGG6 GCTGA6CAAT CTGACCCACG TGCTGTATCC 1560

TGTCCCCCTG GAGAGTTATG AGGACATCCa TGGTAOCCTC CACCTGGAGA OOCTTOOCTA 1620 

TCTGCATGCC AGGCTCAGGG AOTTGCTGTG TGAGTTGGGG CGGCCCAQCA TGGTCTGGCT 1680 

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CAGAACCTTC TATQACC0G6 AGCCCATCCT 1740 

GTGCCCCTGT TTCATGCCTA ACTAGCIGGG TOCACRTATC AAATGCTTCA TTCTGCATAC 1800 

TTGGACACTA AAGCCAGGAT GTGCATGCAT CTTGAAQCAA CAAAGCAGCC ACAGTTTCAG 1860 

ACAAATGTTC AGTGTGAGT6 AGGAAAACAT GTrCS^jSTOAG GAAAAAACAT TCA6ACAAAT 1920 

GTTCAGTGAG GAAAAAAAGG GGAAGTTGGG GATAGGCAGA TQTTGACTTG AGGAGTTAAT 1980 

GTGATCTTTG GGGAGATACA TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAG6GA 2040 

GATTCTGGCT TGGGAAGTAC ATGTAGGAGT TAATCCCTGT GTAGACT6TT GTAAA6AAAC 2100 
TGTTGAAAAT AAA6AGAAGC AAT6TGAAGC AAAAAAAAAA AAAAAAAA 

Seq ID NO: 30 Protein sequence: 
Protein Accession #: NP_006106.1

1 11 21 31 41 51 

GCTTCAGGGT ACAGCTCCCC CGCAGCCAGA AGCCX3GGCCT GCA6CX3CCTC AGCACCGCTC 60 

CGGGACACCC CACCCGCTTC CCAGGCX5TGA CXTTOTCAACA GCAACTTOGC GGTGTGGTGA 120 

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGCGTGGCA ACAAGTGACT 180 
6AGACCTAGA AATCCAAGCG TTGGAGGTCC TGAGGCOkGC CTAAGTCGCT TCAAAATGGA 240

AOGAAOGOGT TTQTQGGGTT CC»TTCAGAO OCQATACATC A6CATGA0TG TGTGGACAAG 300

CCCACGGAGA CTTGTGGAGC TG6CAG6GCA GAGCCTGCTG AASGATGAGG CCCTGGCCAT 360

TGCCGCCCTG GAGTTOCTGC CCAGGQAGCT CTTCCCGCCA CTCTTCATGG CAGCCTTTGA 420

CGGGAGACAC AGCCAGACCC TGAAGGCAAT GGTGCAGGCC TGGCCCTTCA CCTQCCTCCC 480

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGAG ACCTTCAAAG CTGTGCTTGA 540

TCGACTTOAT 6TGCTCCTTG CCCftGOAGGT TCOCOCCAGG AGGTGGAAAC TTCAAGTGCT 600

GGATTTACX3G AAGAACTCTC ATCAGQACTT CTGOACTGTA TGGTCTGGAA ACAGGOCCAG 660

TCT6TACTCA TTTCCAGAGC CAGAAGCAGC TCAGCCCAT6 ACAAAGAAGC GAAAAGTAGA 720

TGGTTTGAGC ACAGAGGCAG AGCAGCCCTT CATTCCAGTA GAGGTGCTCG TAGACCTGTT 780

CCTCAAGGAA QGTGCCTGTG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGCXSAAA 840

QAAAAATGTA CTAOGCCTGT GCTGTAAGAA GCTGAAGATT TTTGCAATGC CCATGCAGGA 900

TATCAAGATG ATCCTGAAAA TGGTGCA6CT GGACTCTATT GAAGATTTGG AAGTGACTTG 960

TACCTGGAAG CTACCCACCT TGGCGAAATT TTCTCCTTAC CTGGGCCA6A TGATTAATCT 1020

GCGTAGACTC CTCCTCTCCC ACATCCATQC ATCTTCCTAC ATTTCCCOGO AQAAGGAAGA 1080

GCAGTATATC GCX:CAGTTCA CCTCTCAGTT CCTCAGTCTG CAGTGCCTGC AGGCTCTCTA 1140

TGTGGACTCT TTATTTTTCC TTAGAGGCOG CCTGGATCAG TTGCTCAGGC ACGTGATGAA 1200

COCXTTTGOAA AOCCTCTCAA TAACTAACTG COGGCTTTCG GAAGGGGATG TGATGCATCT 1260

GTCCCAGAGT CCCAC3CX3TCA GTCAGCTAAG TGTCCTGAGT CTAAGTGGGG TCATGCTGAC 1320

CGATGTAAGT CCCGAGCXICC TCCAAGCTCT GCTGGA6ASA GCCTCTGCCA CGCTCCAOGA 1380

CCT6GTCTTT GATQAOTGTG QGATCAGGGA TGATCAGCTC CTTGCCCTCC TGOCTTCCCT 1440

QAGCCACTGC TCCCAGCTTA CAAOCTTAAG CTTCTACGGG AATTCCATCT CCATATCTGC 1500

CTTGCAGAGT CTCCTGCAGC ACCTCATCGG GCTGAGCAAT CTGACCCACG TGCTGTATCC 1560

TGTCCCCCTG GAGAGTTATG AGGACATCCA TGGTACCCTC CACCTGGAGA GGCTTGOCTA 1620

TCTGCATGCC AGGCTCAGGG AGTTGCTGTG TGASTTGGOG C6GCCCAOCA TGGTCTGGCT 1680

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CAGAACCTTC TATGACCCOO AGCCCATCCT 1740

OTGCCCCTCT TTCATGCCTA ACTA6CTGGG TGCACATATC AAATGCTTCA TTCTGCATAC 1800

TTGGACACTA AAGCCAGGAT GTGCATGCAT CTTGAAGCAA CAAAGCAGCC ACAGTTTCAG 1860

ACAAATGTTC AGTGTGAGTQ AGGAAAACAT GTTCAGTGAG GAAAAAACAT TCAGACAAAT 1920

GTTCAGTGAG GAAAAAAAGG GGAAGTTQQO GATAGGCAGA TGTTGACTTG AQGAGTTAAT 1980

GTGATCTTTG GGGAGATACA TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAGOGA 2040

GATTCTGGCT TGGGAAGTAC ATGTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 2100
TGTTGAAAAT AAAGAGAAGC AATGTGAAGC AAAAAAAAAA AAAAAAAA



Seq ID NO: 31 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2754

1 11 21 31 41 51

GGCAGGTCTC GCTCTOGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCT06 TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAOAOGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACA6ACACA GAAA6AGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTAT6CAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGAOG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TT6CACTCGG 660

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TA6AGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGIGG GGGTGGTTTG TGCCACAGAC TIGAGATGAAC CGGACACAAT 6CATACGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

A6CACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAQTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGOICATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT 6ATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCOT CAAGTGAACC TGGAAATTQG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTG6TTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAQT GGGGTCAAAG ATCAACOGCT ATAAGGCATA TGACCCOGAA 1560 
AATAQAAATO GCAATGGTTT AAG6TACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620

ATTGAT6AAA TTTCM3G6TC AUTCATAACT TC3CAAAATCC TGGATAGGGA QOTTGAAACT 1680 

CCCAAAAATO AGTTQTATAA TATTACW3TC CTGGCAATAQ AC3\AAGATGA TAGATCATOT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT OTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGOTATACCQ ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGQAGC TCCATTTTAT TTCAffTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT OGAOCCTCAC CAAAOTTAAT OATACAGCTO CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCXTT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

QCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCQ 2100 

ACTTCAAGGA OTTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGG6T 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGQAAAOSTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAOACG ATAOAGTGTO CTCTGCCAAT GGATTTAT6A CCCAAACTAC CAACAACTCT 2340 

AGCCAA6GTT TTTCSIGGTAC TATGGGATCA GGAATGAAAA ATGQAGGGCA GOAAACCATT 2400 

GAAAT6ATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGQGGGCTGG GCATCATCAT 2460

ACCCPGGACT OCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAQATA CACTTACTOG 2520 

GA6TG6CACA GTTTTACTCA ACCCCX5TCTC QGTGAAAAAT TGCATOSATG TAATCAGAAT 2580 

GAAGACC6CA T6CCATCCCA AQATTATGTC CTCACTTATA ACTATGAGGG AAGAGQATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700 

AATAATTT6G AACCCAAATT TATTACATTA GCAGAAGCAT QCACAAAGAG ATAATGTCAC 2760 

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGQAG6TTT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATQAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAQC CAGTTGITGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTACX3GAT ATTTTAGTAA TAAATATGCT G6ATAAATAT TAOTCCAACA 3060 

ATAGGTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AOTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTG6AAAAGA AACAATGAAG 3180 

ACTQAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

ACCAAATTCA TTT6ACTTTG 6AGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GOCTGGOGTA 3420 

TAOTTTGTCC TACAATAGAA AAAAGAGAGA GCTTOCTAGO CCTGGGCTCT TAAATGCTGC 3480

ATTATAACTG AGTCTATGAG GAAATAGTTC CTGTCX3ATT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TG6TTTCTGT GGGAAGGAAA TAGG6AATCC AAT6GAACAG 3600 

TAGCTTTGCT TTGCS^OTCTO TTTC3UU1ATT TCIOCATCCA CAAOTTAGTA GCAAACTGG6 3660 

QAATACTOQC T6CAGCTGGQ OTTCCCTGCT TTTTG6TAGC AAGGGTCCAG AOATGAGGTG 3720 

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTC AAGTTAAATC 3780 

CTCTATTGCT CTTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTTAATT TAOATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGAGGCTA GA0GQA6CTG AGGGGASGAT CTTACTGAAA 3900 

GChOCCIGGG GAGATTGATT GTCCTTAAAC CTAA6CCCCA CAAACIT6AC ACCTGATCAa 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCOC TTCTTCXtSAG TGGCATTG6C 4020

CTGAATCAAG 6AAAGCCAGG CCTTGTGGGC CCCCTTCTTT OGQCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTT6CTT TCTCCAGAGA AATTTTAAAA TAATAGAAGA AATA6AAATT 4200 

TTGAATGTAT AAAAQAAAAA GATCAAOTTG TCATTTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AA6CAGCXX:A AGTAGQTTAT TT6TACA8TC AOAGGGCAAC A6GAAGAT6C AGGCCTTCAA 4320 

GGGCAAGGAQ AGGCCACAA6 GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380 

CTTTTTOCTA GGCTTGGCAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCOGGTOAG GGATCAGC2CA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

AAGOAGACAO AGCIGACTGC ATGAT6AGTC TGAAGGCATT T6CA6QAIGA GCCTGRACTG 4560 

GTTGTGCAGA ACAAACAAG6 CATTCATGG6 AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TTTTCTOTTT TCTAATTTGA CXX:TAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

cccccccxrcc ' m ' rrrmxj agacggagtc tcoctctgac gcacaqgctg gagtgcagtg 4800

GCTCCQATCT CTGCTCACTO AAAGCTCOGC CTCCCaGOTT CAMCCATTC TCCTGCCTCA 4860

GCCTCCTGAB TAGCTGGGAC TACAG6CGCC GACCACCACX3 C00G6CTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTOTGTTAO CCAGGATGOT CTCGATCTCC TGACCTCGTG 4980 

ATCCGCCTGC CTCGQCCTCC CAAAGTGCTG GGATTACAGG CATGACCCAC C3GCTCCCGGC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATO^ TTTT6AACAT GTGTGAAAGT 5100 

TGATCATAC6 AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTOG A6AAGCCASG 5160 

GGGAGAAAOA ACTCAG6GCA CAAAATATTG GTCTOAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTGCTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAA CACC C 5280 

ACTQTGTTTT GCTCACTCCC TCACTCACCG ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCOQAA 5460

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGA6TGT6 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTCCCCCAQT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATGTTT TTTCTTACTT TTATAAGOAA GCAGCTQTCT AAAAT6CAGT 5700 

GGGGTTTGTT TTGCAATGTT TTAAACA6AQ TTTTAGTATT GCTATTAAAA GAAQTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT 5820 

TACA6ATGTG GGGAGATCTA ATAAAACAAT ATTAACTTGG TTTCTT6TTT TTQCrGTATT 5880 

TAGA6ATTAA ATAATTCTAA GA1X3ATCACT TTGGAAAATT ATS CTTAT GG CTSGGAT6GA 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGG6G AATATTTTGG ACAATG TTTC 6000 

ATTATCAAAT TOT06ACATC ATTAATATAT ATT6TAATGT TGGQAAGAGA TCACTATTTT 6060 

GAAGCACAGC TTTACAQATG AGTATCTATG ATACATATGT ATAATAAATT TTOATCXSGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATT6 CAGA8TATTC CATOAATAGT ACACT6ACAC 6180 

A6G0GTTTTA CTTTGAGGAC CAGrTGTASTG AAGGQAAAAC ATQAGTTAAA A AQAA AAGCA 6240 

GGCAATATTQ CAGTCTTGAT TCTGCCACTT ACAG6ATAGA TAAT6CCTGA ACTTTAATGA 6300 

CAAGATQATC CAACCATAAA GGTGCTCTGT GCTTCAC3W3T GAATCTTTTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAAOG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

A6CCTTAGAT TTTAATATA6 QTTQAACCAA AATTTCAA2T CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAQAATGT TCATTG6ATT TTTGTTTGTA ATASTAAAAT 6540 

ACCGGATACA TTTCAGGITGT CCTTCAGTAT TGATTTG6TT GAA7ATT6GG TCATAATGGT 6600 

TQA6AAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TTCTGTQTGA CCTTTGAAAO GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780

ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900
TATATATAAT CCCGAAACAT G

Seq ID NO: 32 Protein sequence:
Protein Accession #: NP_001932.1

PCT/US02/12476



1          11         21         31         41         51
|          |          |          |          |          |

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG

DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR



Seq ID NO: 33 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2583

1          11         21         31         41         51
|          |          |          |          |          |

GGCAGGTCTC GCTCTCGGCA CCCTCCCQGC GCCGGCGTTC TCCTGGCCCT GCCCGGCATC 60

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300

TACACAGCCA GGGCTCTTGC 0CTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420

TCGAAGACAA GACACACTAG AGAAACTGTT CTGAGGCGTG CCAAGAGGAG ATGGGCACCT 480

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840

ACTAGAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080

ACTTGTATCA TAACAGTAAC AGATTCAAAT QATAIOGCHC CGACTTTCAG ACAAAATGCT 1140

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680

CCCAAAAATG AGTTGTATAA TATTACAGTC CTQGCAATAG ACAAAGATGA TAGATCATGT 1740

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800

GAATATGTAG TCATTTGCAA ACCAAAAATG GGQTATACCG ACATTTTAGC TGTTGATCCT 1860

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920

AGTAGACTGT GGAGCCTCAC C3Uy^TTAAT GATACAGCTG COOGTCTTTC ATATCAGAAA 1980

AATGCTGGAT TTGAAGAATA TACCATTCCT ATTACTCTAA AAGACAGGGC OGGCCAAGCT 2040

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220

GGQAAAGOTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580

TAAAAATTAA ACATAAAAGA AATTGCATOG ATGTAATCAG AATGAAGACC GCATGCCATC 2640

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880

TOTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940

AGCCAGTTGT TGCTTATCTT TTGCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TQCTCTTTTT TTTTTTTA0G 3060

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120

ATATCACATT ATTATGTATT CACTTTAAGT QATAGTTTAA AAAATAAACA AGAAATATTG 3180

AGTATCACTA TGTGAAGAAA GTTTTGQAAA AQAAACAATG AAGACTGAAT TAAATTAAAA 3240

ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCXXTPACTGC ACTACCAAAT TCATTTGACT 3300

TTGGAGGCAA AATGTGTTGA AGTGCCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360

GGAAATAAAT GTGTGTOTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAQAGG AAAATGOTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AGAGCTTCCT AG6CCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GA6GAAATA0 TTCCTGTOCA ATTT6TGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT' 3600 

TTCTSGTTTC TQTGQGAAGG AAATAGOGAA TOCAATGGAA CAGTAGCTTT GCTTTX3CAGr 3660 

CTGTTTCSIAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGAATACT OGCTGCAGCT 3720 

GGGQTTCCCT GCTTTTTQGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT OQG GGAG CTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CT6AAGTTAA ATCCTCTATT GCTQTTTCTA 3840 

TTCTCTCTTA TAGIGACCAA CATCTTTTTA ATTTAGATOC AAATAAOCAT GTCCTCCTA8 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAO GATCTTACT6 AAAGCACCCT GGGQAOATIG 3960 

ATTGTCCTTA AAOCTAAGCC CCACAAACTT GACAOCTGAT CAGGTCTGGG AGCTACAaAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAOGTTT TCCACCATCC TTCA6CX3TGA ATTaATTTTT AATCAOTTTG 4200 

CrrrCTCCAO AGAAATTTTA AAATAATAGA AOAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AQAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAQGGC AACAGGAAGA TGCAGGCCTT C/iAGQGCAAG GAGAGGCCAC 4380 

AAG6AATATG GGTGGGAGTA AAAGCAACAT OOTCTGCTTC ATACTTTTTC CTA6GCTTG6 4440 

CACTGOCTTT TCCTTTCTCA GGCXSRATGGC AACTGCCATT TGAGTCOGGT GAGGGATCAG 4S00 

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTCGrPGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAQAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTO TT TTCTAATT 4740 

IGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TOCCCCCCCX: CCCTTTTTTT 4800 

TTGAGAOGGA GTCTOGCTCT GACGCACAGG CT6GA0TGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCOGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTCQATC TCCTGACCTC GTGATCCGCC TGCCTTCGGCC 5040 

TCCCAAAGTG CTGGQATTAC AGGCATQACC CACGSCTCCC G6CCTT8TTT TCC8TTXAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CAT6TGTGAA AOTTGATCAT A06AATTGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTQAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATT TCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAA06GC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC OATCAAAACC T6CTACCTCC CCSMGACTTT ACTAGT6CGB ATAA ACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCC06A ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTOAGTQT GCAAQAAAAT ATATTTTTAA 5640 

A6CTTTCATT TTFCCCCCAG TGAATOATTT AOAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAAT6CA0 TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACA6A GTTTTAOTAT TGCTATTAAA AOAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATOCAT AAAGTAATAT TTAC3W3ATGT GGGGAGAT6T 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAQATO TGQGSAtaATO TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAQATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG G6AATATTTT GGACAATGTT 6X20 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TT6AAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTOATCGG 6240 

GTATTAAAAG TATTAGAAGG T6GTTATAAT TGCAGAGTAT TCCAT6AATA GTACACTGAC 6300 

ACAGG6GTTT TACTTTGAGG ACCA6TGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GT6AATCTTT TCCQCAT6CA 6480 

GGAGTGTGCT CCCCTACAAA CQTTAAQACT 6ATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTOTGTATGT CTTCAAOAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATT6ATTTGG TTGAATATTG GGTCATAAT6 6720 

OTTGAQAAGC ATGGACACTA GAGCCAQAAT GCTTGGATAT GAATCCTGGA TCTOTCRCTT 6780 

ACTTCTGTQT GACXTTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAQAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GT6AGQTAGT TGGTAAAATT 6960 

ATGTAOrTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020 
CATAtATATA ATCCCGAAAC ATG 

Seq ID NO: 34 Protein sequence:
Protein Accession #: NP_077741.1

1          11         21         31         41         51

|          |          |          |          |          |

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780
MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EESIRGHTG

Seq ID NO: 35 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 146-1273

1          11         21         31         41         51

|          |          |          |          |          |

GGQAGT6GGC GTGGCGGTGC TGCCCAGGTG AGCCACCGCT 6CTTCTGCCC AGACAOGGTC 60 

GCXrrCCACAT CCAGGTCTTT GTGCTCCTCG CTTGCCTGTT CXrrTTTCCACJ GCATTTTCCA 120 

GGATAACTGT GACTCCAGGC CCGCAATGGA TGCCCTGCAA CTAGCAAATT CGGCTTTTGC 180 

GGTTGATCTG TTCAAACAAC TAT6TGAAAA GGAGCCACTO QOCAATGTCC TCTTCTCTCC 240 

AATCTGTCTC TCCACCTCTC TGTCACTTGC TCAACTGGGT GCTAAAGOTG AGACTGCAAA 300 

TGAAATTGGA CAGGTTCTTC ATTTTGAAAA TGTCAAAGAT ATACCCTTTG GATTTCAAAC 360 

AGTAACATCG GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 420 

CTACGTAGAC AAATCTCTGA ATCTTTCTAC AGAGTTCATC AGCTCTACX5A AGAC5ACCCTA 480 

TGCAAAGQAA TTQGAAACTO TTGACTTCAA AQATAAATTG QAAGAAACGA AAGGTCAGAT 540 

CAACAACTCA ATTAAG6ATC TCACA6ATG0 CCACTTTQAQ AACATTTTAG CTGACAACAO 600 

TGTGAACGAC CAGACCAAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAAGTGGAT 660 

GAAGAAATTT CCTGAATCAG AAACAAAAGA ATGTCCTTTC AGACTCAACA AGACAGACAC 720 

CAAACCAGTG CAGATGAT6A ACATGGAGGC CACXSTTCTGT ATGGGAAACA TTGACAGTAT 780 

CAATT6TAAG ATCSATAGAGC TTCCTTTTCA AAATAAGCAT CTCAGCftTGT TCATCCTACT B40 

AGC3C3U^GGAT GTGGAG6ATG A6TCCACAGG CTTGGA6AAG ArfGAAAAAC AACTCAACTC 900 

AGAGTCACT6 TCACAGTGGA CTAATCCCAG CACCAT6GGC AAT6CCAAG6 TCAAACTCTC 960 

CATTCCAAAA TTTAAGGTGG AAAAGATGAT TGATCCCAAG GCTTGTCTGG AAAATCTAGQ 1020 

GCTGAAACAT ATCTTCAGTG AAGACSiCATC TGATTTCTCT G6AATGTCAG AGACCRAGGQ 1080 

ASTG6CCCIA TCAAATGTTA TCCACAAAGT 6T6CTTAGAA ATAACTGAAO ATGGTGGGGA 1140 

TTCCATAGA6 GT6CCAGGAG CACGGATOCT 6CAGCACAA6 GATOAATTGA ATGCTGACCA 1200 

TCCCTTTATT TACATCATCA GGCACAACAA AACTCJGAAAC ATCATTTTCT TTGGCAAATT 1260 

CTGTTCTCCT TAAGTGGCAT AGCCCATGTT AA6TCCTCCC TGACTTTTCT GTGGATGCCG 1320 

ATTTCTGTAA ACTCTGCATC CAGA6ATTCA TTTTCTAGAT ACAATAAATT GCTAATGTTG 1380 

CTGOATCAOG AAQCOGCCAG TACTTGTCAT ATGTAGCCTT CACACAGATA GACCTTTTTT 1440 

TTTTTCCAAT TCTATCTTTT GTTTCCTTTT TTCCCATAAG ACAATGACAT ACGCTTTTAA 1500 

TGAAAAGQAA TCAOGTTAGA GGAAAAATAT TTATTCATTA TTTGTCAAAT TGTCCGGGGT 1560 

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTCXnTAT AAGGAAGATT TGGAAGCTCT 1620 

TCTTCCCAOC ACTATGCTTT CCTTCTTTGG GATAGAGAAT GTTCCAGACA TTCTCGCTTC 1680 

GCTGAAAGAC TQAAGAAAOT GTAGTGCATG GGACCCACGA AACTQCCCTG GCTCCAGTGA 1740 

AACTTGG6CA CATGCTCAG6 CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCTGGCAG 1800 

GCAGQTGTTT ATTAAAATTC TGAATTTTGG GGATTTTCAA AAGATAATAT TTTACATACA 1860 

CTGTATGTTA TAGAACTTCA TGGATCAGAT CTGGGGCAGC AACCTATAAA TCAACACCTT 1920 

AATATGCTGC AAC3UVAATGT AGAATATTCA GACAAAATGG ATACATAAAG ACTAAGTAGC 1980 

CCATAAGGGG TCAAAATTTG CTGCCAAATG CGTATGCCAC C3UVCTTACAA AAACACTTCG 2040 

TTOSCAGAGC TTTTCAGATT GTGGAATGTT GQATAAGGAA TTATAGACCT CTAGTAGCTG 2100 

AAATGC3UW3A CCCCAAGAGG AAGTTCAGAT CTTAATATAA ATTCACTTTC ATTTTTGATA 2160 

GCTGTCCCAT CTGGTCATGT GGTTGGCaCT AGACTGGTGG CSkOGGGCTTC TAGCTQACTC 2220 

GCACAGGGAT TCTCACAATA GCCGATATCA QAATTTGTGT TGAA66AACT TGTCTCTTCA 2280 

TCTAATATGA TAGOGQGAAA AGGAGASGAA ACTACTGCCT TTAGAAAATA TTVAGTA AAGT 2340 

QATTAAAGTG CTCACGTTAC CTTGACACAT AGTTTTTCAG TCTAT6GGTT TAGTTACTTT 2400 

AOATGGCAAG CATGTAACTT ATATTAATAG TAATTT6TAA AGTTG6GTG6 ATAA GCTATC 2460 

CCTGTTGCCG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTT6 2520 

TGACATTCCT TCTCCCATCT CTTCCTTGAC ATGCRTTQTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTGAATTTCT CCTATGCTAT TQACAATAAA ATATTATTOA ACTACC 

Seq ID NO: 36 Protein sequence:
Protein Accession #: NP_002630.1

1          11         21         31         41         51

|          |          |          |          |          |

KDALQIiANSA FAVDLFKQLC EKEPLQIVLF SPICL5TSLS LAQVGAKGDT ANEIGQVIiHF 60 

BNVKDIPFGF QTVTSDVNKL SSFYSIiKLIR RIiYVDICSLNL STBFISSTKR PYAKELETVD 120 

FKDKLEBTK6 QIKKSIKDLT SGHFEtlZLAD NSVNDQTKIL WMAAYFVGK WMKKFPESET 180 

KECPPRLNKT DTKPVQMMNM EATPCMGNID SINCKIIELP FQNKHLSMFI LLPKDVEDES 240 

TGLEKIEKQL NSESLSQWTN PSTMANAKVK LSIPKPKVEK MIDPKACLEN LGLKHIFSED 300 

TSDFS6MSET KGVALSNVIH KVCLEITEDG GDSIEVPGAR ILQHXDELtlA DHPFIYIIRH 360 
NKTRSrilFFG KFCSP 



Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: NM_016583

Coding sequence: 72-842

1          11         21         31         41         51

|          |          |          |          |          |

G6AGTGGGGG A6AGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA 6TCCAGATAC 60 

TAAGAGCAAA GATGTTTCAA ACTGGG6GCC TCATT6TCTT CTACGGGCTG TTAGCCCA6A 120 

CCATGGOCCA GTTT6GA0GC CTGOCOSPGC CCCTGGACCA GACCCTGCCC TTGAATGTGA 180 

ATCXAGCCCr GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GGCCTCAGCA 240 

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 300 

TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGAOGT 360 

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420 

AACTTGGCCT TGTGCAQAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATAOQ CCCCTGGTOG GTGCAA6TCT GTTGAGGCTG GCTGTGAAQC 540 

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA 6GAQAGGATC CACCTGGTCC 600 

TTGGTQACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTOCTTGAT GQACTTG6CC 660 

CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGQGAT CTTGAATAAA GTCCTGCCTG 720 

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780 

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACA6TTTGTC ATCAAGGTCT 840 

AAGCCTTCCA GGaagGGGCT GGCCTCTGCT QAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 

GCCCATGTGC TGGAAGATGA CACAGTTOCC TTCTCTOOSA GGAACCTGCC CCCTC TCCTT 960 

TCCCACCAGO OGTGTGTAAC ATOOCATGTG CCTCACCTAA TAAAATGOCT CTTCTTCTGC 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA

Seq ID NO: 38 Protein sequence:
Protein Accession #: NP_057667

1          11         21         31         41         51

|          |          |          |          |          |

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 60
SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 120
VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180
THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 240
DIVNMLIHGL QFVIKV



Seq ID NO: 39 DNA sequence
Nucleic Acid Accession #: NM_004363.1
Coding sequence: 115-2223

1          11         21         31         41         51

|          |          |          |          |          |

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAOTCA CAGCAGCCTT GACAAAACGT 60

TCCTtSGAACr CAAGCTCTTC TCCACAGAGG AGGACAOAGC AOACAGCAOA GACCATGQAa 120

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AQAGGCTCCT GCTCACAGCC 180

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTQOCAAGC TCACTATTGA ATCC3VCGCCG 240

TTCAATGTCX5 CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACOBTCAAAT TATAGGATAT 360

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCGQCATACA GTGGTOGAGA GATAATATAC 420

CCCAATGCAT CCCTGCTOAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480

CACXrrCATAA AGTCAGATCT TGTQAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCOG 540

GAGCTGCCCA AQCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600

GTGGCCTTCA CCTOTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660

CAGAGCCTCC G83TCAGT0C CAGGCIGCAG CTGTCCAATO GCAACAOQAC CCTGACTCTA 720

TTCAATSTCS^ CAAGAAATGA CACAGCAAGC TACAAATOTG AAAC3CCAGAA CCCAGTCAGT 780

GCCAGGCGCA GTGATTC3W3T CATCCTGAAT GTCCTCTATG GCCCGGATGC CCXXACCATI 840

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTQA ACCTCTCCTG CCAOGCAGCC 900

TCTAACCCaC CTGCACAGTA CTCTTGGTTT GTCAATGGQA CTTTCCAGCA ATCC3VCCX31A 960

GAOCTCTTTA TCOCCAACAT CACTGTOAAT AATAGTGGAT CCTATACGTG CCAA8CGCAT 1020

AACTCAGACA CTGGCCTCAA TAGQAGCACA GTCA08ACGA TCACRGTCTA TGCAGAGCCA 1080

CCC3VAACCCT TCATCACCAG CAACAACTOC AACCCOGTGG AGGATGAGGA TCCTQTAGCC 1140

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200

CTCCCGGTCA GTOCCAGGCT GCASCTGTCC AATGAC3«CA GGACCCTCAC TCTACTCA3T 1260

GTCACAAGQA ATGATGTAGG AOCCTATGAG TGTGGAATCC AGAA0GAATT AAGTGTTGAC 1320

CACAGOGACX: CAGTCATCCT GAATGTCCTC TATGGCCCRG AOGACCCCAC CATTTCCCCC 1380

TCATACACCT ATTACGGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATQC AGCCTCTAAC 1440

CCACCTGCAC AGTATTCTTG GCTGATTGAT G0QAACAT(3C AGCAACACAC ACAAGAGCTC 1500

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGQC CAATAACTCA 1560

GCCAGTGG0C ACAGCAGGAC TACAGTCAAG ACAATCACAO TCTCTGOGGA GCTGCCCAA0 1620

CCCTCCATCT OCAGCAACAA CTCCAAACCC GTQGAGGACA AGGATGCTGT GGCCTTCACC 1680

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTOTGGTGaG TAAATGGTCA GAGCCTCCXa 1740

GTCAGTCrCA GQCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800

AGAAATGAGG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860

GAOCCAOTCA 0CCTGGATGT CCTCXATGGG OOGOACACCC CCATCATTTC CCCCCCAGAC 1920

TOGTCTTAOC TTT0GGGAGC GAACCTCAAC CTCTCCTGCC ACTC3GG0CTC TAACCCATCC 1980

CCGCAGTATT CTTGGOGTAT CAATGGGATA CC3GCAGCAAC ACACACAAGT TCTCTTTATC 2040

GCCAAAATCA CX3CCAAATAA TAACGGGACC TATQCCTGTT TTGTCTCTAA CTTGGCTACT 2100

GGCC3GCAATA ATTCCATAGT CAAGAiGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160

CTCTCAGCTG GGGCCACTGT OSGCATCATG ATTGGAGTGC TGGT TCGGGT TGCTCTQATA 2220

TAGCAGCCCT GGTGTAOTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280

TAAAGCATTT GCAACAGCTA CAQTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAQAAA 2340

AGACTCTGAC CAGAQATCGA GACCATCCTA GCCAACATOG TGAAACCCCA TCTCTACTAA 2400

AAATACAAAA ATGAGCTGGG CTTGGTGGC3G OGCACCTGTA GTCCCAGTTA CTCGQQAGGC 2460

TGAGGCAOGA GAATOSCTTG AACGCGGGAG GTGGAGATTG CAGTGAGCCC AGATOGCACC 2520

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580

TCTGAOCTQT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTOTCCA CCAAGATCAA GCAGAGAAAA 2700

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760

TTCCCAGATT TCAQGAAACT ' m * m ' Ci ' TT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCAGAAGT 2940
TCAATAAAAA TCTQCTCTTT GTAIAACAOA AAAA

Seq ID NO: 40 Protein sequence:
Protein Accession #: NP_004354.1

1          11         21         31         41         51

|          |          |          |          |          |

MESPSAPPHR WCIPWQRUiL TASIiLTFWNP PTTAKLTIBS TPFNVAEGKE VLLLVHNLPQ 60

HLPGYSWYKG BRVDGNRQII GYVIGTQQAT PGPAYSGREI lYPNASLLIQ NIIQNDTGFY 120

TliHVIKSDLV NEEATGQFRV VPELPKPSIS SNNSKPVEDK DAVAFTCEPB TQDATYLMWV 180

NNQSLPVSPR LQLSNGSIRTL TX«FNVTIINDT ASYKCBTOHF VSARRSDSVI U9VLY6PDAP 240

TISPIiNTSYR SGENUILSCH AASNPPAQYS HFVNGTFQQS TQELPIPWIT VNtTSGSYTOQ 300

AHNSDTGLNR TTVTTITVYA EPPKPPITSM NSNPVBDEDA VALTCEPEIQ NTTYLWWVNN 360

QSLPVSPRIiQ LSNDKRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420

SPSYTYYSPG VNLSIiSCHAA SKPPAQYSWL IDdTZQQHTQ ELFISNXTEK NS6LYTCQAN 480

NSASGHSRTT VKTITVSAEL FKPSISSNNS KFVBDKDAVA FTGEPEAQNT TYLHHVNGQS 540

LPVSPRLQItS mSIRTliTLFll VTRMDARAYV GQIQH6VSAN RSDPVTUlVXi Y6PDTFZISP 600

PDSSYLSGAN LNLSCHSASN F8PQYSNRIH GIPQQHTQfVL PIAXITPNNN 6TYACPVSNL 660
ATGBNKSIVX SITVSASGTS PGLSAGATVG IMIGVLVGVA LI



WO 02/086443


Seq ID NO: 41 DNA sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11-793

1          11         21         31         41         51

|          |          |          |          |          |

AATCXX5SACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 120

ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 180

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCIGGGS TATTTCATTC TGATGTTTAT 300

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360

ACCCAACCTC rrCCTQAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480

CAATTGCTGT GGCGTAAATG GTCXa^TCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540

TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 600

AGAACCrCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660

CTGCTATGAA CTGATCTCTG GTCCAATGAA C0GACA08CC TGGGGOGTTO CCTGGTTTGG 720

ATTTGCCATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780
AATTGAATAT TAAGAA



Seq ID NO: 42 Protein sequence:
Protein Accession #: NP_008883.1

1          11         21         31         41         51

|          |          |          |          |          |

MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 60
IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 120
FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 180
DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 240
LCWTFWVLLG TMFYWSRIEY



Seq ID NO: 43 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 83-2605

1          11         21         31         41         51

|          |          |          |          |          |

GCCGGACAGA TCTGCGOGTA TCCTGGAGCC GGCCCAGTTG TGAACTAGGA GAGCTTTGpG 60

ACCTCTGTCC CAAGCAAGAG AGATGAATGG AGAGTATAGA GGCAGAGGAT TTGGAGGAGG 120

AAGATTTCAA AGCTGGAAAA GGGGAAGAGG TGGTGGGAAC TTCTCAGGAA AATGGAGAGA 180

AAGAGAACAC AGACCTGATC TGAGTAAAAC CACAGGAAAA CGTACTTCTG AACAAACCCC 240

ACAGTTTTTG CTTTCAACAA AGACCCCACA GTCAATGCAG TCAACATTGG ATCQATTCAT 300

ACCATATAAA GGCTGGAAGC TTTATTTCTC TQAAGTTTAC AOGQATAOCT CrCCTTTGAT 360

TGAGAAGATT C3VAGCATTTG AAAAATTTTT CACAAGGCAT ATTGATTTGT ATGACAAGGA 420

TGAAATAGAA AGAAAGGGAA GTATTTTGGT AGATTTTAAA GAACTGACAG AAGGTGGTGA 480

AGTAACTAAC TTQATACCAG ATATAGCAAC TGAACTAAGA GATGCACCTG AGAAAACCTT 540

GGCTTGCATQ GGTTTGGCAA TACATCAGGT OTTAACTAAO QAOCTTGAAA GGCATGCAGC 600

TGAQTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA ATGHRSCCACA 660

TATTCATGCA AGGGTGTACA ACTATGAGCC TTTGACACAQ CTCAAGAATG TCAGAGCAAA 720

TTACTATGGA AAATACATTG CTCTAAGAGG GACAGTGGTT CGTGTCAGTA ATATAAAGCC 780

TCTTTGCACC AAGATGGCTT TTCTTTGTGC TGCATGTGGA GAAATTCAGA GCTTTCCTCT 840

TCCAGATGGA AAATACAGTC TTCCX3VCAAA GTGTCCTGTG CCTGTGTOTC OAGGCAOOTC 900

ATTTACTGCT CTCOGCAGCT CTCCTCTCAC AGTTACGATG GACTGGCAGT CAATCAAAAT 960

CCAGGAATTG ATGTCTGATG ATCAGAGAGA AGCAGGT0GQ ATTCCACGAA CAATAGAATG 1020

TGAGCTTGTT CATOATCTTG TGGATAGCTQ TGTCCOCSGGA GACACAGTGA CTATTACTGG 1080

AATTGTCAAA GTCTCAAATG CGGAAGAAGG TTCTGGAAAT AAGAATGACA AGTGTATGTT 1140

CCTTTTGTAT ATTGAAGCAA ATTCTATTAG TAATAGCAAA GGACAGAAAA CAAAGAGTTC 1200

TGAGGATGGG TGTAAGCATG QAATGTTQAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260

CCAAGAGATT CAAGCTGAAG AAAACCTGTT TAAACTCATT GTCAACTOGC TTTGCCCTGT 1320

CATTTTTGGT CATGAACTTG TTAAAGCAGG TTTGGCATTA GCACTCTTTG GAGGAAGCCA 1380

GAAATACGCA GATGACAAAA ACAGAATTCC AATTCGGGGA GACCCCCACA TCCTTGTTGT 1440

TGGAGATOCA OGCCTAGSAA AAAGTCAAAT GCTACAGGCA GCGTGCAATG TTGCCCCACG 1500

TGGOGTGTAT GTTTGTGGTA AGACCAC3GAC CACCTCTGGT CTGACGGTAA CTCTTTCAAA 1560

AGATAGTTCC TCTGGAGATT TTGCTTTGGA AGCTGGTGCC CTGGTACTTG GTGATCAAGG 1620

TATTTGTGGA ATCGATGAAT TTGATAAGAT GGGGAATCAA CATCAAGCCr TCZTTGGAAGC 1680

CATGGAGCAG CAAAGTATTA GTCTTGCTAA GGCTGGTGTG GTTTGTAGCC TTCCTGCAAG 1740

AACTTCCATT ATTGCTGGTG CAAATCCAGT TGGAGGACAT TACAATAAAG CCAAAACAGT 1800

TTCTGAGAAT TTAAAAATGG GGAOTGCACT ACTATOCAGA TTTGATTTGG TCTTTATCCT 1860

GTTAGATACT CCAAATGAGC ATCATGATCA CTTACTCTCT GAACATGTGA TTGCAATAAG 1920

AGCTGGAAAG CAGAGAACCA TTAGCAGTGC CACAGTAGCT CGTATGAATA GTCAAGATTC 1980

AAATACTTCC GTACTTQAAG TAGTTTCTGA GAAGCCATTA TCAGAAAQAC TAAAGOTQGT 2040

TGCTGQAGAA ACAATAGATC CCATTCCCCA CCAGCTATTG AGAAAGTACA TTGGCTATGC 2100

TCGGCAiSTAT GTGTACCCAA GGCTATCCAC AGAAGCTGCT OGAGTTCTTC AAGATTTTTA 2160

CCTTGAGCTC CGGAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAQCT 2220

GGAATCTTTG ATTCGTCTGA CAGAGGCACG AGCAAGGTTG GAATTGAGAG AGGAAGCAAC 2280

CAAAGAAGAC GCTGAGGATA TAGTGGAAAT TATGAAATAT AGCATSCTAG GAACTTACTC 2340

TGATGAATTT GGGAACCTAG ATTTTGAGCQ ATCCCAGCAT GGTTCTGGAA TGAGCAACAG 2400

GTCAACAGCG AAAAGATTTA TTTCTGCTCT CAACAACGTT GCTGAAAGAA CTTATAATAA 2460

TATATTTCAA TTTCATCAAC TTCGGCAOAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520

TTTTGAAAAT TTTATTGGAT CACTAAATGA CCAGGGTTAC CTCTTGAAAA AAGGCCCAAA 2580

AGTTTACCAG CTTCAAACTA TGTAAAAGGA CTTCACCAAG TTAGGGCCTC CTGUGTTTAT 2640

TGCAGATTAA AGCCATCTCA GTQAAGATAT GCGTGCACGC ACAGACAGAC AGACACACAC 2700

ACACACACAC ACACACACAC ACACACACAC ACACACAGTC AAATACTGTT CTCTGAAAAA 2760

TGATGTCCCA AAAGTATTAT AATAGGAAAA AAGCATTAAA TATAATAAAC TAATTTAAGA 2820

AGTGATAAAG TCTCCAGATG CAGTAGCTCA CACTGTAATC ACAGTGACTC AGQAGGCTOA 2880

GOIGAGAGGA TTCCTT6AGG CCAGGGTTGG AGACCAACXn* TGGGCAACAT AGCSU^CCC 2940 

CATTTCTTAA AAAAAAAAAA AAAAAATTTA AACTTA6CTG GGTATGQTGG C ACaTG CCTA 3000 

TAGTCTCAGC TACTTGTGAG GCTGAGGCAG GAGGATTCTT TGAGCCCAGG AGTTTGAGGT 3060 

TACAGTGAGC CACAATCACA CCAATCACTG CACTCCAGCC TGGGCAATAA AGTAACTCTT 3120 

GACTCAAAAA AATAAAAAAA ATTGTAGTGG TAGCCATGTG TTAATTGTTA AATAAATTCT 3180 

OCAAAGGGCT AAAAGTAAAT TACTTATAAA TTTTTTATAfi TTGTATTTTT 6ACCTGCCTT 3240 

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAOATGT AGGCATACAG ACAAATACAT 3300 

AAACCAATGA ATATATTACA TATTCTGTOT TCCAATAAAA CTTTATTTAT GGAC ACTAAA 3360 

ATTTGAATTT CATAAAATTT TOCCATGTCA AGAATACAAA AZACTTYSAGT TTTCTTTTTA 3420 

GCTATTTAAT AATAGGTCTC ATTTATTCCA CAG6CTGTAG TTTGTAGTCT TGCTT6AAAC 3460 

AATAOAAACA GACTGATTAA GCAGGAGAAG TTTTTTGAAA GAATTTTGTT TGGCTOIOGQ 3540 

AATTATTAGA A6GCAGGTGA ACCAGGAGGG TAAGCTTCCA GCAGCAATTT GTAAAACCAT 3600 

GCCTTAGAAT T6GACTAAGG AAGAAGCTGC TGACACTCCA CTGCCACACA GGGCACTG6A 3660 

AGAAA6TGCT GCTGCCTCCC TGCCCCACCT TTGCCACTTC TQCAGCAGQA ATAGGTAOAA 3720 

GAATGCCCCC ACCOGCACCG GAACAGCAAC AAAAQQATTC TOCATQAOAT GCCTCCCTAA 3780 

ATTGCTGAAT TCAAAAAAGA AGTTGCATAC AAAGACATCT GATTGAAAAA GGGTATGTTA 3840 

TATGCCCCTT TCATAGQCTG CTAGGGAGTT TTCXrTGGTTC TACTTTCAGG TGGTGGGATC 3900 

AATAAGACCA GAATTTCTCA TATGTTGTGA GAQ6ATTCAA ATGTTACA6G GTTGCCAGCC 3960 

AAACTATCAA TCATGTATAA ATOCAACAAA CACmGTAA CATACAAGAA CTCAQGAAAT 4020 

GTGAACCATT GTTG6A6AAT CTACTAAAAT AGGGCTTCCC GCAAAOGAAG ATGAATGGAA 4080 

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA CCAGAACTAA 4140 

GATGTGGAQA CTATTQCCAT ASACCACAAT GTAAATTTTT AAGTGAGGAA GGAAAAATCA 4200 

6GAATCAAAA G6GGCCAGGT GCAGTG6CTC ACATCTATAA TCCCAGAGCT TTGGQAGTTC 4260 

GAGGCA6GA6 GATCACTTQA AGOCAGTTTT GA6ACCAGCC TAT6CAACAC ATTGA6ACCC 4320 

TATCTCTACA AAAAATAOAT TA0CCGG6CA OGGIQGTGCA TGC CTAT T6T CCTACCtACT 4380 

GTGGAGGCTO AA6TAGGAAA TCACIT6A6C COGAGAGTTT GAGGTTACAO TQAGCTATQA 4440 
TTATACCACT 6CACTCCAGC CTGGGCAAGA 6AGCAAGACC TTGTCTCTT 

Seq ID NO: 44 Protein sequence:
Protein Accession #: CAB55276.3

1 11 21 31 41 51

| | | | | |

NNGBYRORGF GR6RFQSWKR 6R(36(aiFSGK HREREHRTOL SKTT6KRTSE QTFQFIiLSTK 60 

TFQSMQSTLD RFIPYKOWKL YFS&VYSDSS FLZEKIQAFE KFFTRHIDLY DKDEIERKGS 120 

ILVDFKELTB GGEVTNIiIPD lATEIiRDAPB KTLACMGLAI BQVLTKDLER HAAELQAQBG 180 

LSNDQETMVN VPHIHARWN YEPLTQLKNV RANYYGKYIA LRGTWRVSN IKPLCTKMAP 240 

LCAAOQEIQS FPLPDGKYSL FTKCPVPVCR GRSFTAI1RS8 PLTVTMDWQS IRIQELMSDD 300 

QREAGRIPRT lECELVBDLV DSCVPGDTVT ITGIVXySNA EBBSRKKNDK CMFLLYZEAN 360

SISHSKGQKT KSSED6CKHG MLMBFSIiKDL YAIQEIQAEE NLFKLIVNSL CPVIFGBELV 420

KAGLALALFG GSQKYAIIDKN RIPIRGDPHI LWGDPGLGK SQMLOAACNV APRGVYVCGN 480 

TTTTSGLTVT LSKDSSSGDF ALEAGALVLG DQGI06ZDEF DKhfQNQHQAL LEAMEQQSIS 540 

LAKAGWCSL PARTSIIAAA NPVGGHYNKA KTVSENLKNG SAIiLSRFDLV FILLDTPKBH 600 

HDHLLSEHVI AIRAGKQRTI SSATVARMNS QDSNTSVLBV VSBKPLSERL KWPGETIDP 660 

IPflQLliRKYI GYARQYVYPR ItSTBAARVLQ DPyiiBIiRRQS QRUISSPITT RQLBSLIRLT 720 

BARARLEUIB EATKBDAEDI VBIMKy814LG TYSSEFGHU} FBRSQE6SGM SNRSTAKRFI 780 
SAUniVAERT YNNIFQFBQL RQIAKEUnQ VADFENFIGS LNDOGYLIJCK GPKVYQLQTM 



Seq ID NO: 45 DNA sequence

Nucleic Acid Accession #: NM_005416.1

Coding sequence: 149..658

1 11 21 31 41 51

| | | | | |

ACCAGATCCC AGAGGCTGAA CACCTOQACC TTCTCTGCAC AGCAGATGAT CX:CTGA6CAG 60 

CTGAAGACCA GAAAAGCCAC TAAGACTTTC TGCTTAATTC AGGAGCTTAG AGGATTCTTC 120 

AAAGAGTGTG TCCAOQATCC TTTQAAGCAT QAGTTCTTAC CAGCAGAAGC AGACCTTTAC 180 

CCCACCACCT CAGCTTCAAC AGCAGCAGGT GAAACAACCC AGCCAGCCTC CACCTCAGOA 240 

AATATTTGTT CCCACAACCA AGGAQCCATG CCACTCAAAG GTTCCACAAC CTGGAAACAC 300 

AAAGATTCCA GAGCCAGGCT GTACCAAGGT CCCTGAGCCA GGCTGTACCA AGGTCCCTOA 360 

GCCAGGCTGT ACCAAGGTCC CTGAGCCAGG TTQTACCAAG GTCCCTGAGC CAGGCIGTAC 420 

CAAGGTCCCT GAGCCAGGTT GTACCAAGGT CCCTGAGCCA GGCTACACCA AGGTCCCTOA 480 

ACCAGGCAGC ATCAAGGTCC CTGACCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 540 

CAAAGTTCCT GAGCAAGGAT ACACCAAA6T TCCTGTGCCA GGCTACACAA AGCTACCAGA 600 

GCCATGTCCT TCAAOGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC A6AAGTAATT 660 

TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCACCAGAT6 CT8GACACCC TCTTGCCATC 720 

TGTTTCTGTG TCTTAATTGT CTGTAGACCT TGTAATCA6C ACATTGTCAC CCCAAGCCAT 780 

AGTCTCTCTC TTATTTGTAT CCTAAAAATA CGTACIATAA AGCTrTTGTT CACACACACT 840 

CTGAAGAATC CTGTAAGCCC CTQAATTAAG CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 900 

GGCTGCTCA6 GGTTCATCTG AAGATTCQAA TGAAAAGAAA TGCATGTTTC CTGCTCTTCC 960 
CTCATTAAAT TGCTTTTAAT TCCA 



Seq ID NO: 46 Protein sequence: 
Protein Accession #: NP_005407.1

1 11 21 31 41 51

| | | | | |

MSSVQQKQTF TPPPQLQQQQ VKQPSQPPPQ EIPVPTTKEP CHSKVPQPGN T3CXPEPGCTK 60 

VPEPGCTKVP BPGCTKVPEP GCTKVPEPGC TKVPEPGCTK VPEPGYTKVP BPGSIKVPDQ 120 

GFIKFPEPGA IKVPBQOrPK VPVPGYTKLP EPCPSTVTP6 PAQQRTKQR 



Seq ID NO: 47 DNA sequence 

Nucleic Acid Accession #: Eos sequence




1 11 21 31 41 51 

| | | | | |

GCGTCGTGTG CAGGCGTCCC CGGGCTGTG3 ATAATTAGAC AOSTTCTTCC CTCATTGCCC 60 

AAGGCTCGTT AGAATTCGCC CTAGAGCTGT ATCATGTATT TTCTTTCS^AA TTAACTTTGC 120 

TTGCAATTAA GCTTAQGGAA CCAGCAACAA AA6C3\AACTT GGCCCQAGGT CGTTCACCGC 180 

GAAAATGGAT TAGAGAAACT TCTTCCCCGA TTTAAGGQGA AAGATTCCTG CGGCCAGOGC 240 

TTT6GGGAAA GTGCCCX36AC CGCAGAGGOO AOGACAQGGO AGCAGOAAGC TGCTCAOGOT 300 

AGTCGGCGTT GGCGGCA60G GTGGCCTTCC TCATCTGGGC GATGTGG6CT GCTAGAAGA6 360 

TAAGGATAAC ATCCTGGAAA TQACTTCTGT ACGGTTTGAG CCCAACTGCa CACTCATGAC 420 

TTGGAGCTGC CCTGTGGAGT TACAGTTTAC CAAACaCATT CATGAACATA ATCTCATTTA 480 
CTAAAAACTT T6TGAGAATT TTCTTTTACT AAAATTTTTT CTTATTACAA A 

Seq ID NO: 48 DNA sequence:

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

| | | | | |

TTCCAAATTT TTTTTTTTGT AATAAGAAAA AATTTTAGTA AAAGAAAATT CTCACAAAGT 60

TTTTAOTAAA TGAGATTATO TTCATGAATQ TGTTTGGTAA ACTGTAACTC CACAGQGCAG 120 

CTCCAAGTCA TGA6TGTGCA GTTGGGCTCA AACOGTACAG AAGTCATTTC CAGGATGTTA 180 

TCCTTACTCT TCTOGGAGCC CACATCGCCX: AGATGAGGAA GGCCAGCXXT GCCX3C CAACG 240 

CCXSACTAC06 TGAGCAGCTT CCTGCTCCCC TGTCGTCGCC TCTGOGGTCG GGGCACTTTC 300 

CCCAAAGCGC TGGCCGCAGG AATCTTTCCC CTTAAATOGG GGAAGAAGTT TCTCTAATCC 360 

ATTTTCGCGG TGAACGACCT CX3GGCCAAGT TTGCTTTTGT TGCTGGTTCC CTAAGCTTAA 420 

TTGCAAGC3A AGTTAATTTG AAAGAAAATA CATGATACAG CTCTAGGGOG AATTCTAACG 480 

AQCCTTGGGC AATGAGGQAA GAACOTGTCT AGTTATCCAC AGOCCGGOGA OGCCTGCACA 540 
CGAOGCT 

Seq ID NO: 49 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

| | | | | |

TCTTTCTTCT GCTGCTCGTT TGTCTCTCCT GTGCTCTTCT TCTTTCTTTC CCTCGCOGCT 60 

CCTGCCOACC TCTGTTGTCT CTTCTCTGAT GGCGGGGGGC GGGAGAAGCT GACCGGTGAG 120 

ACCGTAGACC CQAAACCATT GGGTGTCACA AGCCGGTCGC CGGCTTTTTT GGGAGAACCC 180 

GACACATGCA GACCRGTTTT CCTGGAACNG CATGACCATG TTATTACTAT GGGCCGCCTC 240 

CCCAACCAAA GTGTTTAAAA CTTTTTAGGG CACCCCCAAA ATTTTTTTTT TTTTTTTTTT 300 

TTCATTTAAA AAACTCTAAT ATTTATATTA AATACAAAGA TAC CCAAACC CTTTATGCTT 360 

CTTTCTCTGA TCTGTGTCTT TTTTCTTTGA CAGCATCTCG ATTTTTTTTC TGCTGCTTCA 420 

T06CTGTAGC CATGGGAATC CGTTTCATTA TTATGGTAGC AATATGGAGT GCTGTATTCC 480 

TAAAGAAACT GACACAGGAG AATCACTTGA ACTTGGQAGG CAQAGTTTGC AGTGAGCCGA 540 

GATTGAACCA GTGCACTCCA GCCTTGGCAG -CGGAGCRAGA TTCTGTCACA GTTCCTGAAG 600 

TGCTGGTATC GTCCTGCAGC CCCATCCTCQ OTTCCATTOC GCTGCCMaOC AQ QGTO CTGG 660 

GACGTGG6GA GAGCTGGTCT ATATATCCOG GTOAAGCTCA GCTGTGGCAC ACCTTOaATG 720 

CCQGCTCTCT CCTGGCCCOS GGGACCTAOT ATTTTTQCCA CGAGT6TACA CCAAACAAAG 780 

GAQACAGCAT CATTTATGAG CCTGCAGCAT CCACCCTACT GCXGTATCCA GTTTOCATTG 840 
ACTG 

Seq ID NO: 50 DNA sequence
Nucleic Acid Accession #: L05187
Coding sequence: 1991..2260

1 11 21 31 41 51

| | | | | |

CTGCAGGGAG GCAGGTAGAA AAGGCTTTTQ GGTTTTCAGG TGGGGGGCAG TCTAGCCTGA 60 

TCAGAAAGGA GGAAAAGGCC AGGGCAGATG TCTGGGTGGA GTGAAGGGAA AAAGTGATCC 120 

CAGAAGAAGQ ATTAOCCCCT GAAAOTOCCT GAAGTAQ6AG AA6GGTAAAG GTGTGGTTGG 180 

TGAAOf&UUUS CAGOTTTTCC CAGATTAOCA ACCAGTCAGG GGGAGGAAGG TGAGAGTGGG 240 

AGAGTCATAA GTAAATTATT CT6AATGTGT GTAGTTTAAT GGAATTGGGA AAAAGATGGG 300 

GQAAATGGAT GGAAGGTCTT GGACTCTGAG ACftAGGGGTC TATAATCAGT CCATTTCATT 360 

ATTTCTAGCT TCCACCTTCA CCAAGGCAGA CAAGGAQGGC CCACCTCAGC TCCTCTGCTC 420 

CCCCTCCCTT TCOCACCTAT TCATGTGTGC AAGAGTGCCC TGTCCCACAG AACAGGGOGA 480 

ACAACCATCT CAATGACAAG QACAGCAGGT GGCAAGGCTC AACAGGACTC AGATGTCCCC 540 

OCAGGGTTAA CTCATGAAAC CCTCCATGAA 6CCTGCTGCT CACCCCTCCC TCAAGGCAAG 600 

CCCTGCACCT GGGTCTGAGQ AT6AGGGTG0 CAGTQAAAAT TAGGCCAGT6 ACATCATTTT 660 

CAGCCAGCTA 6TGCCAAAAA ATATCAGGTG GTGTTCATCA AATAAGCC6A GCCAACC8GT 720 

GATGAGQATG GTAOTGrOAG TCATQTGT6A CAGGTGAGGA ATGAAAACAG AGTGCCCGA6 780 

AGCTTCTATT TCCTT6AGGC AGGGCTCATT CATCTTATAA AAGCCAGCTG GCCATTGCCT 840 

TCACACCAAA CCCAAGGGAC CACACAGCCC ATTCTGCTCC GTATACCAGG TAAGTCTCTG 900 

ATTGCAACAA ACTGGCAAtT CTAGTGTACT TTTTCATTAT TAGAAATTAG CTAAAGGCAA 960 

ATATCTGTAA GCAGGTTAAT CCAGGGTTTC AATGGGAGAT AQAQAATAGT GOAATATCTT 1020 

TATTTTAAGT TAAATTACAG TCTGGATTTG AAAGGACCTT AGAGATGGTT AGGGCTCCCA 1080 

CCTCAGTAGA TAGTCATTGA ACTGGGAGTC CTGGAGAAGA TTGTTCAAAT GCCCATGGGA 1140 

AGTTCATAGC AQAACTAGAA CTCAGGCCAG AGCACTCTCA GTAACACTGC AATTTCCCCC 1200 

TGACAAGATA TTTATAGAAA TTTTAATTTA TTAGATGGAT CTCTACTGAG CATTTATTCC 1260 

ATTTAAGGCA GTATGCTAGG CRCTTTOQAC AAATCAATGC CCTAACGTAC TTACTTAACA 1320 

AACATAAAAC CTAGCAGGAA GGTAATACAT ATATATAAAT AAATGAAATG CAAAGTAGAT 1380 

AGTAATTGGC ATGACGGAGA TGGGCAOAGA AGG6CTGTOC ACTTTTGGGA GACTT6CTCA 1440 

AGGAGACCTC TAGGGTGTCA AGTQATGTQA GCTATGATGG AGGGGTATTT GGACAAGCAG 1500 

AGATGGGAAG AAAAGCATTT GGAAGGGACT GTGTAA6CAC AGACCAGAA6 CAAAACCATA 1560 

GAGGCTTA6A TGAATATAAA GCCATCCTAT AAGTCACAGG CTTTCTACAT GGTACTAGQA 1620 

GAGGAAAGTG GTCTGATGOC ATTTTCCAAA AGACCTAATA T6CGGACCTC ATGTOOCTCA 1680 

GAAGCCAGCT TTAGTAGGGC ATTTTTCCAG AACAGATATA AGGTGCCTTG GGTAGGAAGG 1740 

QAGCCAAGAA GAGAACTCCA ATAAAATGGA GCAGAAGAAA TTGCCTTTTA GCTCCTCCTC 1800 

TTCAAAGGGC CTGAAAATTA TCCAAGCTTA TTTCATTTTT AAATGTAATG GGGGAGCTAA 1860 
GGGAGATGAA AGGCTTTCTC TTCTAAAGGG TCC^TGAAATA AAATCTGTTT GGCATTGAAT 1920
TTGTATCaiT CTTTCTTTAA TTGAATCACT GTGTCAGCTT TCTGTCTCTA 6AAAAAAACA 1980
CATTTGAAGC ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCA3CCTCA 2040
GCA6CA6CA6 6T6AAACAAC CTTGCCA6CC TCCACCCCAG GAACXAT6CA TCCOCAAAAC 2100
CAAGGAGCCC TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAOTGC CTGA6CCXTG 2160
CCAGCCCAAG ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GA6CCCTGCC CTTCAACGGT 2220
CACTCCAGCA CCAGCCCAGC AOAAGACCAA GCAGAAGTAA TOTG6TCCAC AGCCATGCCC 2280
TTGAGGAGCT GGCCACTGGA TACTGAACAC CCTACTCCAT TCTGCTTATG AATCXrCATTT 2340
GCCTATTQAC CCTGCAQTTA GCAT6CTGTC ACCCT6AATC ATAATCGCTC CTTTGCACCT 2400
CTAAAAAQAT GTCCCTTACC CTCATTCTOG AGGCrCCTGA 6CCTCTG06T AAGGCTOAAC 2460
GTCTCACTGA CTGAGCTAGT CTTCTTGTTG CTGGGCTGCA TTTGAG6ATG GATTTGG^A 2520
AGGTCAAGTG ACCATCCCTA G

Seq ID NO: 51 Protein sequence:
Protein Accession #: AAC26638

1 11 21 31 41 51
| | | | | |
MNSQQQKQPC TPPPQPQQQQ VKQPCQFPPQ EPCIPKTKEP CQPKVPBPCK PKVPEPOQPK
ZPGPCQPKVP EPCPSTVTPA PAQQKTKQK

Seq ID NO: 52 DNA sequence
Nucleic Acid Accession #: NM_002638.1
Coding sequence: 120-473

1 11 21 31 41 51
| | | | | |
CAATACASCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCX3CX:CTGGA GCCAGGCCAA 60
GCTOGACTGC ATAAAQATTG GTATGGCCTT AGCTCTTAQC CAAACACCTT CCTGACACCA 120
TGAG66CCA6 CAGCTTCTTG ATC6T6GTGG TGTTCCTCAT 06CTG6GAC6 CTGGTTCTAO 180
AGGCAGCTCT CAOGGGAGTT CCTGTTAAAG QTCAAOACAC TGTCftAAGGC CQTCTTOCAT 240
TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300
06CAAQAGCC AGTCAAAGGT CCAOTCTCXS^ CTAAGCXrrGG CTCCTGCCCC ATTATC7TGA 360
TCC06T60GC CATGTTGcAAT CCGCCTAAOC QCTGCTTQAA AGATACTGAC TGCCCAGGAA 420
TCAAGAAGTG CT6TOAAGGC TCTTGCGGCa TGOCCTGTTT CGTTCCCCAG TGMGGGAGC 480
CXWTCCTTGC TGCACCTQTO CC3GTCCCCAG A6CTACAGGC CGCATCTOGT CCTAAGTCCC 540
TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTOCC ATTCAGGAT6 CCCA06GCTG 600
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A

Seq ID NO: 53 Protein sequence:
Protein Accession #: NP_002629.1

1 11 21 31 41 51
| | | | | |
MRA88FLIW VFLIAGTLVL BAAVTGVPVK GQDTVKGRVP FNQQDFVKGQ VSVKOQDKVK 60
AiQEPVKGPVS TKPGSCPIIL IRCAMXdTPPN KCLKHTDCPO IKKOCEGSCG KACFVFQ

Seq ID NO: 54 DNA sequence
Nucleic Acid Accession #: NM_019618
Coding sequence: 75-584

1 11 21 31 41 51
| | | | | |
GGCAOGAGCC AGGATTCAGT CCCCTGGACT 6TAGATAAAG ACCCTTTCTT GCCAGGTGCT 60
GAGACAACCA CACTATGEAGA GGCACTCCA6 GAGAC6CTGA TGGTGGAGGA AGGGCOGTCT 120
ATCAATCAAT GTGTAAACCT ATTACTGGGA CTATTAATGA TTTGAATCAG CAAGTGTGGA 180
CCXTTCAGGG TCAGAACCTT GTGGCAGTTC CACX3AAGTQA CAGTGTGACC CCAGTCACTG 240
TTGCTGTTAT CACATGCAAG TATCCAGA6G CTCTTGAGCA AGGCAGAG6G GATCCCATTT 300
ATTT66GAAT CCAGAATCCA GAAATGTGTT TGTATTGTQA GAA6GTTGGA GAACAGCCCA 360
CATT6CAGCT AAAAGAGCAG AAGATCATGG ATCT6TATGG CCAACCCX3AG CCCGTGAAAC 420
CCTTCCTTTT CTACCGTGCX: AAGACTGGTA GQACCTCCAC CCTTGAGTCT GTGGCCTTCC 480
CGGACTGOTT CATTGCCTCC TCCAAGAGAG ACCAGCCCAT CATTCTGACT TCAGAACTTG 540
GGAAGTCATA CAACACTGCC TTTQAATTAA ATATAAATGA CTGAACTCAG CCTAGAGGTG 600
GCAGCTTGGT CTTTGTCTTA AAGTTTCTGG TTCCCAATGT GTTTTOGTCT ACATTTTCTT 660
AGTQTCATTT TCAOGCTGGT GCTGAGACAG GGGCAAGGCT GCTGTTATCA TCTCATTTTA 720
TAATGAAGAA GAAGCAATTA CTTCATAGCA ACTGAA6AAC AGQATGTGQC CTCAGAA6CA 780
GGAGAGCTGG GTGGTATAAG GCTGTCCTCT CAAGCTG0T6 CTGTGTAGGC CACAAGGCAT 840
CTGCATGAGT GACTTTAAGA CTCAAAGACC AAACACTGAG CTTTCTTCTA GGG6TGGGTA 900
TQAAGATGCT TCAGAGCTCA TGCGC6TTAC CCACGATGGC ATGACTAGCA CAGAGCTGAT 960
CTCTGTTTCT GTTTTGCTTT ATTCCCTCTT GGGATGATAT CATCCAGTCT TTATATGTTG 1020
CCAATATACC TCATTGTGTG TAATAGAACC TTCTTAQCAT TAAGACCTT6 TAAACAAAAA 1080
TAATTCTTGT GTTAAGTTAA ATCATTTTTG TCCTAATTGT AATGTGTAAT CTTAAAGTTA 1140
AATAAACTTT 6TGTATTTAT ATAATAAAAA AAAAAAAAAA

Seq ID NO: 55 Protein sequence:
Protein Accession #: NP_062564

1 11 21 31 41 51
| | | | | |
MEU3TPGDADG GGRAVYQSMC KPITGTIMDL NQQVNTLQGQ NLVAVPRSDS VTPVTVAVIT
CKYPEALEQG RGDPIYLGIQ NPEKCLYCBK VGEQPTLQLK EQKIMDLYGQ PEPVKPFLFY
RAKT(SITSTL BSVAFPDHFI ASSKRDQPII LTSELGKSYN TAFELNIND

Seq ID NO: 56 DNA sequence
Nucleic Acid Accession #: NM_003125
Coding sequence: 65-334

1 11 21 31 41 51

| | | | | |

AGCAGTTCTA AGGGACCATA CAGAGTATTC CTCTCTTCAC ACCAGGACCA GCCACTGTTG 60 

CAGCATGAOT TCCCAGCAGC AGAAGCAGCC CTGCATCCCA CCCCCTCAGC TTCAGCAGCA 120 

GCAGGTGAAA C3«SCCTTGCC AGCCTCCACC TCftGGAACCA TGCATCCC^ 180

GCCCTGGCAC CCCAAGQTGC CT6AGCCCTG CCACCCCAAA GTGCCTGAGC CCTGCCAGCC 240 

CWM3CTTCCA QAGCCRTGCC ACCOC3UU3GT GCX3WW50CC TOCXICTTCSA TAGTCACTCC 300 

AGCACCAGGC CAGCAGAAGA CCAAGCAGAA GTAATGTGGT CCACAGCCAT GCCCTTGAGG 360 

AGCOGGCCAC CAGATGCTGA ATCCCXTTATC CCATTCTGTG TATGAGTCCC ATTTGCCTTG 420 

C3U^TTAGCAT TCIGTCTCCC CXAAAAAAGA ATGTGCTATG AAGCTTTCTT TCCTACACAC 480 

TCIGAGTCTC TGAATGAAOC TGAAGGTCTT AGTACCAGAO CTAGTTTTCA OCTGCTGAGA 540 

ATTCATCTGR AGAGAGACTT AAGATQAAAQ CAAATGATTC AGCTCCCTTA TACCCCCATT 600 
AAATTCACTT TCAATTCCA 



Seq ID NO: 57 Protein sequence:
Protein Accession #: NP_003116

1 11 21 31 41 51

| | | | | |

MSSQQQKQPC IPPPQLQQQQ VKQPCQPPPQ BPCIPKTKEP CHPKVPBPCH PKVPBPCQPK 60 

LPEPCHPKVP BPCPSrVTPA PAQQKTKQK 

Seq ID NO: 58 DNA sequence

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71-2560

1 11 21 31 41 51

| | | | | |

AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCX3C GGCAGCTGCT TCACCCCTCT 60 

CTCTQCA6CC ATGGGGCTCC CTOSTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 120 

CTGGCTGCAG TGCGCGGCCT CCGAGCOGTG COGGGCQGTC TTCAQGGAGG CTGAAGTGAC 180 

CTTGOAGGOG GGAGGCX3CGG AOCAGGAGCC OGGCCAGGCG CTGGGGAAAC TATTCATQGG 240 

CTQCCCTGGG CAAGAGCXIAG CTCrGTTTAO CACTGATAAT GATGACTTCA CTGTGCGGAA 300 

TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGQAAAGG AATCCATTGA AGATCTTCCC 360 

ATCCAAACGT ATCTTACXyUV GACACAAGAQ AGATTGGGTG GTTGCTCCAA TATCTGTCCC 420 

TGAAAATGGC AAGGGTCCXTT TCCCCCRGAG ACTGAATCAG CTCAAGTCTA ATAA AQATA G 480 

AGACACCAAG ATTTTCTACA GCATCACGOG GCCOGGGGCA GACAGCCOCC CTGAGGOTGT 540 

CTTCGCTGTA GAGAAGQAGA CAOOCTGGTT GTTGTTGAAT AAQCCACTG6 ACXMGOAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAQGA 660 

CCCCATQAAC ATCTCCATCA TCGTGACOSA CCAGAATQAC CACAAGCCCA AGTTTACCCA 720 

GOACACCTTC CX3A6QGAGTG TCTTAGAG6G AGTCCTACCA GGTACTTCTQ TGATGCAGGT 780 

GACA6CCAC3Q GATGAGGATG ATGCCArCTA CACCTACAAT QOOSTGOTtG CTTACTCCAT 840 

OCATAGCCAA GAACCAAAGG ACCCACAOGA CCTCATGTTC ACCATTCACC GQAGCACAGO 900 

CACCATCAGC GTCATCTCCA GTGGCeTGGA CCXSGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCAGGCC ACAGACATGG ATGGG6ACGQ CTCCACCACC ACGGCAGTGG CAGTAGTQGA 1020 

GATCCTTGAT GCCAATGACA ATGCTCCXaT GTTTGACCCC CAGAAGTACG AGOCCCATGT 1080

QCCTGAOAAT GCAGTGQQCC ATGAflOTGCA GAGGCTGAGG GTCACTGATC TGGACGCCCC 1140 

CAACTCACCA GCQTG6CGTQ CCACCTAOCT TATCATGGGC QGTaACX3AC3G GQaACCATTr 1200 

TACCATCACC ACCXaCCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260 

TTTTGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAA6TG ACCAACGAGG CCCCTTTTGT 1320 

GCTOAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGA6GAT6 TGAATGAGGC 1380 

ACCTGTQTTT GTCCCACCCT CCAAAGTOGT TGAGOTCCAO ORGGGCATCC CCACTGGGGA 1440 

QCCTGTGTGT GTCTAC31CTG CAGAAGACCC TGACAA6GAG AATCAAAAGA TCAGCTACOG 1500 

CATCCTOAGA GACCCAGCAG GGTG6CTAGC CATGGACCO^ QACAQTGQGC AGGTCACAGC 1560 

TGTGGGCACC CTCX5ACCGTG AGGATGAGCA GTTTQT6AGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGQC AOGGGAACCC TTCTGCTAAC 1680 

ACTGATTGAT GTC»ATGACC ATOGCCCAGT CCCTGAGCCC CXSTCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGG TGCTQAACAT CACGQACAAG GACCTGTCTC COCACACCTC 1800

CCCTTTCCAG GCCCAGCTCA CAGATQACTC AGACATCTAC TGGACX3GCAG AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCT6 ACGGTGATCA OGGCCACTGT 1980 

GTGCGACTGC CATGQCCATG TC3GAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCCTGTGCTG OGOGCTGTCC TOGCTCTGCT GXTCXTTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

6AGAAAGAAG CGGAAQATCA AGGAGCCCCT CCTACTCCCA QAAGATGACA CXXX3TGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CX3AAGAGGAC CA6GACIATG AC31TCACCCA 2220 

GCTCCAC06A GGTCTGGAGG CCAGGCCGGA GGTGGTTCTC C6CAATGA0G TGGCACCAAC 2280 

CATCATCCOS ACACCCATGT ACCGTCCTOG GCCAGCCAAC CCAGATQAAA TCGGCAACTT 2340 

TATAATTGA6 AACCTGAAGG CX3GCTAACAC AGACCCCACA QCCCCGCTCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC 0GACGC06CG TCCCTGAGCT CCCTCACCTC 2460 

CTCCGCCTCC GACCAAGACC AAGATTACQA TTATCTGAAC QAOTGGGGCA GCGGCTTCAA 2520 

GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCT6CC TGCAGGGCTG 2580 

GGGACCAAAC GTCAGGCX».C AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640 

GACTTCGGAG CTTGTCAGGA AGTGGCCGTA GCAACTTGGC GGAQACAGGC TATGAGTCTG 2700 

ACCTTAGAGT GGTTGCTTCC TTAGCCrTTC AGGATGGAGG AATGTGQGCA GrTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CIOGGCCAGG GTTGCCTCAa AOQOCAAGTT TCCAQAASCC 2820 

TCTTACCTQC GGTAAAATGC TCAACCCTGT QTCCTGOGCC TGG6CCTGCT GTOACTQACC 2880 

TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940 

TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000 

OCTGGGCCCS^ CTQOCaSTCC TGCATTTCTG GTTTCCA6AC CCC3UIT GCCT CCCATT0G6A 3060 

TGQATCTCTO CGTTTTTATA CTGAOTGTGC CTAG QTTGC C CCITATTTTT TATTTTCOCT 3120 

GTTGCGTTGC TATAGATGAA QGGTQAGQAC AATCOTGTAT ATGTACTAGA ACTTTTTTAT 3180 
TAAAGAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 59 Protein sequence: 



Protein Accession #: NP_001784.2

1 11 21 31 41 51

| | | | | |

MGI«PRGPEiAS LLLI^VCWLQ CAASEPCRAV FREAEVTLBA GGAEQBPQQA tJBKVFMXXPG 60 

QEPALPSTDN DDPTVRMQET VQERRSLKBR MPLKIPPSKR ILRRHKRDWV VAPISVPENQ 120 

KOPPPQRLNQ LKSNKDRDTK IFYSITSPGA DSPPEGVFAV EKBTGWIiLLK KPLDREEIAK 180 

YELPGHAVSB NGASVEDPMN ISIIVTDQND HKPKFTQDTP RGSVLEGVLP GTSVMQVTAT 240 

DEDDAIYTYN QWAYSIHSQ BPKDPHDLMP TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300 

TDMDGDGSTT TAVAWEILD ANX3NAPMFDP QXYEAHVPEH AVGSBVQBLT VTDLOAFNSP 360 

AWRATYIilMG QDDGDHPTIT THPBSNQGIL TTRKGLDFEA KNQHTLYVBV TNEAPFVLKL 420 

PTSTATIWH VEDVNEAPVF VPPSKWBVQ EGIPTGEPVC VYTAEDPDKB NQKISYRILR 480 

DPAGIfLAMDP DSGQVTAVGT LDREDEQPVR NMIYBVMVLA MDNGSPPTTG TGTLLLTLID 540 

VNDRGFVP5P RQZTICNQSP VRQVLNiroX DZ«SPHTSPFO AQLTDDSDIY WTAEVNBEOD 600 

TWLSLRKFL HQDmVHLS LSDHGNKEQL TVXRATVCDC HGHVETCPOT WXOGFIIiPVL 660 

GAVLALLFLXj LVLliLLVRKK RKIKEPLIiLP EDDTRDNVFY YGEEGG6EBD QDYDITQLHR 720 

GLEARPEWL RNDVAPTIIP TPNYtlPRPAH PDBZGNFIIB NLKAAimsPT APPYDTIiLVP 780 
DYB6SGSDAA SLSSLTSSAS DQDQOYDYLH EWGSRPKKLA 0MY6GGEDD 

Seq ID NO: 60 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 162-428

1 11 21 31 41 51

| | | | | |

GOGTTCOQTT G6CGGC6GAT TOGAAGGTTC GGACTGAGGT TTTTCT6CCT GAAOAAOaST 60 

CATACGGACC GGATTGTTTT 0GCTGGCC3CA QTGTCCCCX3G AGCTTGTQTG CX3ATACAGAG 120 

AGCACCTOGG AAGCTGAGGC AGCTGGTACT TGACAGAQAQ GATGGOGCTG TCGACCATAG 180 

TCTCCCAGAG GAA6CAGATA AAGOGGAAGO CTCCGGQTGG CTTTCTAAAG GGAGTCTTCA 240 

AGCGAAAOAA GOCTCAACTT GGTCTOOAGA AAAOTGOTGA CTTATTGGTC CATCTGAACT 300 

6TTTACTGTT TGTTCATOGA TTAGCASAAG AGTCCAGGAC AAAiCGCTTGT GOGAGTAAAT 360 

GTA6AGTCAT TAACAAGOAG CATGTACTGG CG6CAGCAAA GGTAATTCTA AAGAAGAGCA 420 

GAGOTTAQAA GTCAAAOAAC ATATTCTTGA AAGTTATGAT GCATTCTTTT GGQTGGTAAC 480 
AGATOVTAAA GACA' mTlT ACACATCAST TAATATGGQA TTATTAAATA TTGG 

Seq ID NO: 61 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

NALSriVSQR KQIKRKAPRG FLKRVFKRKK PQIiRLBKSQD LIiVBLiTCLZiF VBRZiABBSRT 60 

HACA8KCRVI NKEHVLAAAK VIlSXSStQ 

Seq ID NO: 62 DNA sequence 

Nucleic Acid Accession #: NM_000094.2

Coding sequence: 99-8933

1 11 21 31 41 51

| | | | | |

GGGCTGGAGG G6CQCTGGGC TCGGACCTGC CAAGGCCftCC GCAGGGGGGA OCAAGGGACA 60 

GAGG06GGGG TCCTAGCTOA CGGCTTTTAC TGCCTAGGAT GAOSCTGCGG CTTCTGGTGG 120 

CCG0GCTCT6 CGCCGGGATC CTGGCAGAGG CGCCCOOAGT GCGAGCCCAG CACAGGGAGA 180 

GAGTGACCTG CACGOGCCTT TACGCCGCTG ACATTGTGTT CTTACTGGAT GGCTCCTCAT 240 

CCATTGGCCG CAGCAATTTC CGCQAGGTCC QCAGCTTTCT OGAAGGGCTG GTGCTGCCTT 300 

TCTCTGQAGC AGCCAGT6CA CAQGGTGTGC GCTTTGCCAC AGTGCAGTAC AGCGATQACC 360 

CACGGACAOA QTTCGGCCTG QATGCACTTG OCTCTGGGGG TGATGTGATC OGCGCCATCC 420 

GTGAGCTTAG CTACRAGGGG GGCAACACTC GCACRGQGGC TGCAATTCTC CATGTGGCT6 480 

ACCATGTCTT CCTQCCCCAO CTGGCCCGAC CTGGTGTCCC CAAGGTCTGC ATCCTGATCA 540 

CAGAC6GGAA GTCCCAGGAC CTGGTGGACA CAGCTGCCCA AA6GCTGAAG GGQCAGGG6G 600 

TCAAOCTATT TGCTGTGGGG ATCAAGAATG CTGACCCTQA GGA6CTGAAG OQAGTTGCCT 660 

CACAGCCAAC CTCCGACTTC TTCTTCTTCG TCAATGACTT CAGCATCTTG AGQACACTAC 720 

TGCCCCTCGT TTCCCGQAGA GTGTGCA06A CTGCTGGTGG COTGCCTOTG ACCOGACCTC 780 

CGGATGACTC GACXTTCTGCT CGAOGAGAOC TGSTaCZaTC TOAGCCAAGC AOCCAATCCT 840 

TGAGAGTACA GTGGACAGCG GGCAGTGGCC CTGTGACTGG CTACAAGQTC CA6TACACTC 900 

CrCTGACGGG GCTGGGACAG Ca^CTGCOSA GTGAGCGGCA GGAGGTGAAC GTCCCAGCTG 960 

GTGAGACCAQ TGTGOGGCTG OGGGGTCTCC QGCCACTGAC CGAGTACCAA GTGACTGTGA 1020 

TTGCCCTCTA CGCCAACAGC ATCQGGGAGG CTGTGAGCQG GACA6CTCGG ACCACTGCCC 1080 

TAGAAGGGCC GQAACTQAOC ATGCAGAATA CCAGAGCOCA CAGCCTCCT6 GTGOCCTGGC 1140 

GGAGT6TGCC AGOTGCCACT GGCTACGQT6 TGACATGQCG GGTOCTCABT GGTGG6CCCA 1200 

CACAGCAGCA GGAGCTOaGC CCTGGGCAGG GTTCAGTGTT GCTGCGTGAC TTGGAGCCTG 1260 

GCACGOACTA TOAGGTGACC GTGAGCACCC TATTTGGCOG CAGTGT6GGG CCCGCCACTT 1320 

CCCTGATGGC TC6CACTGAC GCTTCTGTTG AGCA6ACCCT GCGCCCG6TC ATCCTGGGCC 1380 

OCACATCCAT CCTC&TTTCC TGQAACTT66 TGCCTOAOQC CGGTGGCTAC CGGTTGGAAX 1440 

GGCGGCGTGA GACTGGCTTG 6A6CCAC0GC AGAAGGTGGT ACTGCCCTCT GATGTGACCC 1500 

GCTACCAGTT GGATGGGCTG C3W3CC3GGGCA CTQAGTACOG OCTCACACTC TACACTCTGC 1560 

TGGAGGGCCA CGAGGTGGCC ACCCCTGCAA CCGTGGTTCC CACTGGACCA GAGCTGCCTG 1620 

TGA6CCCTGT AACAQACCTG CAAGCCACOO AGCT6CCC6G GCA6CGG6TG GGAGTGTCCT 1680 

GGAGCCCA8T CCCTGGT60C AGCCAGTACC GCATCATTGT GCGCA6CACC CAGGOGGTTQ 1740 

A6CGGACCCT GGTGCTTCCT G6GA6TCA6A CAGCATTCQA GTT6GATGAC GTTCAGGCT6 1800 

GGCTTAGCTA CACTGTGCGG GTQTCTGCTC QAGTGGGTCC CCGTGAGGGC AGTGCCAGTG 1860 

TCCTCACTGT CCGCOGGGAG CCGGAAACTC CACTTGCTGT TCCAGGQCTG CGGGTTGTGG 1920 

TGTCAGATGC AACXSOGAGTG AGGGTGGOCT GGGGACCOGT CCCTGGAGCC AGTGGATTTC 1980 

GGATTA6CTG GAGCACAGGC AGTGGTCC6G AGTCCftGCCA GACACTGCCC CCAGACTCTA 2040 

CTGCCACftOA CATCACA606 CTGCA6CCT6 GAACCACCTA CCAGGT6GCT GTGTCGGTAC 2100 

TGCGAGGCAG AGAGGAGGGC CCTGCTGCAG TCATCGTGGC TCGAACGGAC CCACTGGGCC 2160 

CAGTGAGGAC GGTCCATGTG ACTCAGGCCA GCAGCTCATC TGTCACCATT ACCTGGACCA 2220 

6GGTTCCTG6 CGCCACA6GA TACAGGGTTT CCTGGCACTC AGCCCAGGGC CCAGAGAAAT 2280 
CCCAOTrGGT TTCXGGGGAG GCCAOGGTGQ CTGAGCTGGA TGGACTGGAG CCAGATACTG 2340

AOTATAOGOT GCMGTGW30 OCCCMOTGO CrCGCGTOOA TGGGCXXrCT GCCTCTGTGG 2400 

TTOTGAGOAC TGCCCCTGAG CCT0TG6GTC GTGTGTOaAG GCTGCAGATC CTCAATGCTT 2460 

CCAOCGACXST TCTACGQATC ACCTGGGTAQ GGGTCACTGG AGCCACAGCT TACAGACTGG 2520 

CCTGGGGCOG GAGTOAAGGC GGCCCCATGA GGCACCAGAT ACTCCCAGGA AACACAGACT 2580 

CTGCAQAGAT CCGGGGTCTC 6AA0GTGGAG TCAGCTACTC AGTGCGAGTG ACTQCACTTG 2640 

TCOGGGRCOG CGAGG6CACA CCTGTCTCCA TTGTTGTCAC TACGCCX3CCT QAGGCTCOGC 2700 

CAGCCCTGGG GACGCTTCAC GTGGTGCAGC GC3GGGGAeCA CTCGCTGAGG CTGCGCT6GG 2760 

AGCCGGTGCC CAGAGCGCAG QGCTTCCTTC TGCACTGGCA ACCTGAGGGT G6CCAGGAAC 2820 

AGTCCCGGGT CCTGGGGCCC QAGCTCAGCA GCTATCACCT GGACGGGCTG GRGCCAGOSA 2880 

CACAGTACCG CGTGAGGCTG A6TGTCCTA0 GGCCGGCTGG AGAAGGGCCC TCTGCAGAGG 2940 

T6ACTG0800 CACTGAOTCA CCTCGTOTTC CAAGCATTGA ACTACGTGTG GTGGACACCT 3000 

OC3ATCX3ACTC QOTGACTTTQ 00CT6QACTC CAGTGTCCAG GGCATCCAGC TACATCCTAT 3060 

CCTGGCGGCC ACTCAGAGGC CCTGGCCAGG AA6TGCCTGG GTCCCCGCAG ACACTTCCAG 3120 

6GATCTCAAQ CTCCCAGCGG GTGACAGQGC TAGAGCCTGG CGTCTCTTAC ATCTTCTCOC 3180 

TGACGCCTGT CCTGGATGGT GTGCGGGGTC CTGAGGCATC TGTCACACAO ACGCCAGTGT 3240 

GCCCOCSSTQQ CCTOOCGOAT GTGGTGTTCC TACCACATGC CACTCAAGAC AATGCTCACC 3300 

GTGCGSAGGC TAGCSAGGAGG GTCCTGGAGC GTCTGGTGTT GGCACTTGGG CCTCITGGGC 3360 

CACAGGCAGT TCAGGTTGGC CTGCTGTCTT ACAGTCATOG GCCCTCCCXA CTGTTCCCAC 3420 

TGAATGGCTC CCATGACCTT GGCATTATCT TGCAAAGGAT COGTOMATS CCCTACATGG 3480 

ACCCAAGTGG GAACAACCTG QGCACAGCCX3 TGGTCACAGC TCACAGATAC ATGTTGGCRC 3540 

CRGATGCTCC TGGG06CCGC CAGCACGTAC CAGGGGTGAT GGTTCTGCTA GT6GATGAAC 3600 

CCTTGAGAGG TGACATATTC AGCCCCATCC GTGAGGCCCA GGCTTCTGGG CTTAATGTGG 3660 

TGATGTTGGG AATGGCTGGA GCGGACCCA6 AGCAQCTGCX3 TC3GCTTGGC0 CCGGGTATGG 3720 

ACTCTGTCCA GACCTTCTTC GCCGTOQATG ATGGGCCRAO CCTGGAOCAO GCAGTCAGTG 3780 

GTCTX3GCCAC AGCCCTGTGT CAGGCATCCT TCACTACTCA 6CCC066CCA GAGCCCTGCC 3840 

CAGTGTATTG TCCAAAGGGC CAGAAGGGGG AACCTGGAGA GATGGGCCTG AGAGGACAAG 3900 

TTGGGCCTCC TGGCGACCCT GGCCTCCCGG GCAGGACCGG TGCTCCCGGC CCCCAGGGGC 3960 

CCCCTGGAAG TGCCACTGCC AAGGGCGAGA GGGGCTTCCC TGGAGCAGAT GGGCGTCCAG 4020 

GCAGCCCTGG CCGCGCCGOG AATCCTOaGA CCCCTOGAOC CCCTQGCCTA AAGGGCTCTC 4080 

CAGGGTTOCC TGGCCCTCGT G06GACC060 OAGAGOSAGG ACCTCGAGOC CCAAAOGOGG 4140 

AGCCGGGGGC TCXXXMACAA GTCATCGGAG GTGAAGGACC TGGGCTTCCT GGGCQGAAAG 4200 

GGGACCCTGQ ACCATCGGGC CCCCCTGGAC CT0GT«3ACC ACTGGGGGAC CCAGGACCCC 4260 

QTGGCCCCCC AGGGCTTCCT GGAACAGCCA TGAAGGOTGA CAAAGOOGAT CGTGGGGAGC 4320 

GGGGTCCCXX: TGOACCftOGT GAAGGTGGCA TTGCTCCTGG GOAGCCTQGG CTGCCGGOTC 4380 

TTCCCGGAAG CCCTGGACCC CAAGGCCCCG TTGGCCCCCC TGCSftAAGAAA GGAGAAAAAG 4440 

GTGACTCTGA GGATGGAGCT CCAGGCCTCC CAGGACAACC TGGGTC TCCG GOTGAGCAGG 4500 

GCCCACGGGG ACCTCCTGGA GCTATTQGCX: CCAAAGGTGA CC2GGGGCTTT CCAGGGCCCC 4560 

TGGGTGAGGC TGGAGAGAAG GGGGAAOGTG 6ACCCCCAG6 CCCAGCGGGA TCCC6GGGGC 4620 

TGCCAGGGGT TGCTGGACGT CCTGSAGCCA AGGGTCCTGA AGGGCCACCA G6ACCCACTG 4680 

GCXX3CCAAGG AGftGAAGGGG GAGCCTGGTC GCCCTGGGGA CCCTGCAGtG 6TGGGACCTG 4740 

CTGTTGCTGG ACCCRAAGGA GAAAAGGGAG ATGTGGGGCC CGCTGGGCCC AGAGGAGCTA 4800 

CCGQAGTCCA AGGGGAACGG GGCCCACCCG GCTTGGTTCT TCCTGGAGAC CCTGGCCCCA 4860 

AGGGAGACCC TGGAGACCGG 6GTCCCATTG GCCTTACTGG CAGAGCAQGA CCCCCAGGTG 4920 

ACTCAGOQCC TCCTGOAGAO AAGGGAGACC CTGOGOGGCC TOGCOOCCCA GOACCTGTTG 4980 

GCCCCCOAGO ACQAGATGGT GAAOTrGGAG AGAAAGOTQA OGAGGGTCCT CCGGGTGACC 5040 

CGGGTTTGCC TGGAAAAGCA GGCGAGOGTG GCCTTCGGGG GGCACCTGGA GTTCGGGGGC 5100 

CTGTGGGTGA AAAGGGAGAC CAGGGAGATC CTGGAGAGGA TGQACQAAAT GGCAGCCCTG 5160 

GATCATCTGG ACCCAAGGQT GACCGTGGGQ A6CCGGGTCC CCCAGGACCC CCXXX3A0GGC 5220 

T66TAGACAC AGGACCTGGA GCCAOAGAGA A6GGAGAGCC TG6GGAC0GC GGACAAGAGG 5280 

GTCCTOGAGG OCCCAAGGGT GATCCTGGOC TCCCTGGAGC CCCTGGGGAA AGGGGCATTG 5340 

AAGGGTTTC6 GGGACCCCCA GGCCCACAGG GGGACCCAGG TGTCC3GAGGC CCAGCAGGAG 5400 

AAAAGGGTGA CCGGGGTCCC CCTGGGCTGG ATGGCCGGAG CG6ACTGGAT GGGAAACCAG 5460 

GAGCOGCTGG GCCCTCTGGG COGAATGGTG CTGCAGGCAA AGCXGGGGAC CCAGGGAQAG 5520 

AOGGGCTTCC AGGCXHTCCGT 6GAGAACAAG GCCTCCCTGG CCCCTCTGGT CCCCCTGGAT 5580 

TACCXSGOAAA GCCAG60GAG GATGQGAAAC CTGGCCT6AA TGGAAAAAAC GGA6AAGCTG 5640 

GGGACCCTGG AGAAGACGGG AGGAAGGGAG AGAAAGGAGA TTCAGGCGCC TCTGG6RGAG 5700 

AAGGTCX3TGA TGGCCCCAAG GGTGAGCGTC GAGCTCCTGG TATCCTTGGA COCXaOOQaC 5760

CTCCAGGCCT GCC3U3GGCCA GTGGGCCCTC CTGGCCAGGG TTTTCCTGGT GTCCCAGGAG 5820 

GCAjCaOGGCC GAAGGfTTGAC CGTGQGGAQA CTGQATCCRA AGQGQAGCAG GGCCTCCCTG 5880

GAGAOGGXGO CCTGCGAGGA GAGCCTGGAA GTGTGCQGAA TGTGGATOGG TTGCTGGAAA 5940 

CTGCTGGCAT CAAGGCATCT GCCCTGOGGG AQATCGTGGA OACCTGGGAT GAGAGCTCTG 6000 

GTAGCTTCCT GCCTGTGCCC GAACGGCGTC GAGGOCOCAA GOGOGACTCA OGOGAAC3U3G 6060 

GCCCCCCAGG CAAGGAGGGC CCGATOGQCr TTCCTGGAGA ACGCGG6CTG AAGGGC36ACC 6120 

GTGGAGACCC TGGCCCTCAG GGGCCACCTG 6TCTGGCCCT TGGGGAGAGG GGCCCCCCCG 6180

GGCCTTCXXK3 CCTTGCXX3GG GAGCCTGGAA AGCCTGGTAT TCCCGGGCTC CCAGGCAGGG 6240 

CTGGGGGTGT GGGAQAGGCA GGAAGGOCAG GAGAGAGGGG AGAAOGGGGA GAGAAAGGAG 6300 

AACG1X3QAGA ACAGGGCAGA GATGGCCCTC CTOGACTCCC TGOftACCOCT GG60CC0CC0 6360 
GACCCCCTGG CCCCAAGGTG TCTGTGGATG A6CCAGGTCC TGGACTCTCT GGAGAACAGG 6420 
GACCCCCTGG ACTCAAGGGT GCTAAGGGQG AGCCGGQCAG CAATGGTGAC CAW5GTCCCA 6480
AAGGAGACAG GGGTGTGCCA GGCATCAAAG GAGACC6GGG AGAGCCTGGA CCGAGGGGTC 6540 
AGGACGGCAA CCOGGGTCTA CCAGGAGAGC GTGGTATGGC TGG6CCTGAA GGGAAGCCGG 6600 
GTCTGCAG6Q TCCAAGAGGC CCCCCTGGOC CAQTOGGTGG TCMGGAGAC CCTGGACCAC 6660 
CTGGTGCCCC GGGTCTTGCT GGCCCTGCAG GACCCCAAOG ACCTTCTQGC CTGAAGGGG6 6720 
AGCCTGGAGA GACAGGACCT CCAGGAOGGG GCCTQACTGG ACCTACTGGA GCTGTGGGAC 6780 
TTCCTGGACC CCCCGGCCCT TCAGGCCTTG TGGGTCCACA GGGGTCTCCA GGTTTGCCTG 6840 
GACAAGTGGG GGAGACAGGG AAGCOGGGAG CCCCAGGTCG AGATGGTGCC AGTGGAAAAG 6900 
ATGGAQACAG AGGGAGGCCT GGTGTOCCAG GGTCACCAG6 TCTGCCIGGC CCTGrCGGAC 6960 
CTAAAGGAOA ACCTGGCCCC ACGGGOGCCC CTGGACAGGC TCTGOTCOGG CTCCCTOGAG 7020 
CAAAGGGAGA GAAGGGAGCC CCTGGAGGCC TTGCTGGA6A CCTGGTGQGT GAGCCGGGAG 7080 
CCAAAGGTGA CCQAGQACTG CCAGGGCCGC GAGGCGAGAA GGGTGAAGCT GGCCGTGCAG 7140 
GGGAGCCCGG AGACCCTGGG GAAGATGGTC AGAAAGGGGC TCCAGGACCC AAAGGTTTCA 7200 
AGGGTGACCC AQGAQTCGGG GTCCCGG8CT CCCCTGQGCC TCCTGGCCCT CCAGGTGTGA 7260 
AGGGAGATCT GGGCCTCCCT GGCCTGCCC6 GTOCTCCTGG TGTTGTTGGG TTCCOGGGTC 7320 
AGACAGGCCC T06AGGAGAG ATG6GTCAGC CAGGCCCTAG TGGAGAGCGG GGTCTGGCAG 7380 
GCCCCCCAGG GAGAGAAGGA ATCCCAGGAC CCCTGGGGCC ACCT6GACCA CCGGGGTCAG 7440 
TGGQACCACC TGGGGCCTCT GGACTCAAAG GAGACAAGGG AGACCCTGGA GTAGGGCTGC 7500 
CIGGGCCCCG AGGCX3AGOG7 GGGQAGCCAO GCHTCOGGGS TGAAOATGGC 06CCC0GGCC 7560

AGGAGGOACC CCGAGGACTC ACX3GGGCCCC CTGGCAGCAO GGGAQAGCOT GGGGAQAAGO 7620 

OTGATGTTGQ GAOTGCAGQA CTAAAGGGTG ACAAGGGAGA CTCAGCTGTG ATCCTGGGGC 7680 

CTCCA66CCC AOSGGSTGCC AA6GG6GACA TCGGTGAAOG A6G6CCT06G GGCTTGOATG 7740 

GTGAaUAGO ACCTGGGGX3A QACAATQGGG ACGCTGaTQA CAAOGGCAGC AAGGGAGAGC 7800

CT6GTGACAA GGGCTCASCC G6GTT6CCAG gACT G OST G G ACTCCTG6GA CGCCA6GGTC 7860 

AACCTGGTGC AGCAGGGATC CCTGGTGACC CGGGATCCCC AGGAAAGGAT GGAGTGCCTG 7920 

QTATCCGAGG AOAAAAAGGA GATGTTGGCT TCATQGGTCC CCGGGGCCTC AAGGGTGAAC 7980 

6GGGAGTGAA GGGAGCCTGT GGCCTT6ATG GAGAGAAGGG A6ACAA6GGA GAAGCTGGTC 8040

CCGCAiGGCOS COCCGGGCTG GCAGGACACA AAGGAGAGAT GGGOGAQCCT GGTGTOC0G8 8100

6CCA6TGGGG GGCCCCTGGC AAQGAGGGCC TGATCGGTCC CAAGGOTGAC OSAG6CTTT6 8160

AOGGGCAGCC AG6CCCCAAG GGTGACCAGG GCGAGAAAG6 GGAGGGGGGA ACCCCAGGAA 8220

TTGGGGGCTT CCCAGGCCCC AGTGGAAATG ATGOCTCTOC TGOTCCCCXa GGGCCACCTO 8280

GCASTGTTGG TCCCAOAGGC CCCGAAGGAC TTCAQG6CCA GAAGGGTGAO CGAGGTCCCC 8340

COGGAGAGAG AGTGOTGGGG GCTCCTOGGG TCCCTGGAOC TCCTGGOQAG AGAGGGGA6C 8400

AG6GGCG6CC AGGGCCTGGC GGTCCTOGAG GCGAGAAGGG A6AAGCTGCA CTGAGGGAG6 8460 

ATGACATCCG QGGCTTTGTQ OGCCAAGAGA TGAGTCAGCA CTGTGCCTGC CAGGGCCAGT 8520 

TCATCGCATC TGGATCACGA CCCCTCCCTA GTTATGCTGC AGACACTGCC GGCTCCCAGC 8580 

TCCATGCTGT QCCTGTGCTC OGCJQTCTCTC ATGCAGAGGA G6AAGAGCGG GTACCCCCTG 8640

AGOATGATGA GTACTCT6AA TACTCOGMGT ATTCTtQTGGA OGAGTACCAG GACCamAO 8700

CTCCTTGGGA TAGTGATGAC COCTGTTCCC TGCCACTQQA TOAOGOCTCC TGCA CTGCC T 8760 

ACACCCTG06 CTGGTACCAT CGGGCTGTGA CAGGCAGCAC AGAGGCCTGT CACCCTTTTG 8820 

TCTATGGTGG CTGTGGAGGG AATGCCAACC GTTTTGGGAC CCQTGAGQCC TQCQAGOGCC 8880

GCTGCCCACC CCGGGTGGTC CAGAGCCAGG GGACAGGTAC TGCCCA6GAC TGAQGCCCAG 8940

ATAATGAGCT OAGATTCAGC ATCGCCTGGA GaAOTGQGGG TCTCAQCAOA ACCCGACrOT 9000

CCCTCCCCTT GGTGCTAGAG GCTTGTQTGC AOGTGAGGGT OGGAQTGCAC GTCCGTTATT 9060 

TCAGTGACTT GGTCCCGTGO GTCTAGCCTT CCCCCCTGTG GACAAACCCC CATTGTGGCT 9120 

CCTGCCACCC TGGCAGATGA CTCACTGTGG GGGGGTGQCT OTGGGCAGTG AGCX3GATGTG 9180

ACTGGCGTCT GACXTOSCCGC TTGACCCAAG CCTGTGATGA CATGGTGCTG ATTCTQGGGG 9240
GCATTAAAGC TGCTGTTTTA AAAGGCAAAA AA

Seq ID NO: 63 Protein sequence:
Protein Accession #: NP_000085.1

1 11 21 31 41 51

| | | | | |

MTLRIiLVAAIi CAGIIiAEAPR VRAQERERVT CTRIiYAADIV PLLDGSSSIG RSHFREVRSF 60 

LBGLVLPPSG AASACX3VRFA TVQYSODPRT EPOLDALGSG GDVIRAIRBL SYKGGNTRTO 120 

AAILHVADHV FLPQLARPGV PKVCILITDG KSQDLVDTAA QRUOGQGVKL PAVQIKNADP 180

EELKRVASQP T8DFFPFVND FSILRTLLPL VSRRVCTTAG 6VFVTRPPDD 6TSAPRDLVL 240

SBPS5QSLRV QHTAASQPVT GYKVQYTPLT GLGQPLPSBR QEVNVPAGET SVRLRGLRPL 300

TEYQVTVIAL YAHSIGKAVS GTARTTAIiEG PBLTIQNTTA HSU.VANRSV PGATGYRVTW 360 

RVLSGGPTQQ QELGPGQGSV LLRDLBPGTD' YEVTVSTLFG RSVGPATSLM ARTDASVEQT 420 

LRPVILQPTS ILLSWNLVPE ARGYRLEMRR ETGLEPPQKV VLPSDVTRYO LDGLQPGTBY 480 

RLTLYTLLEG HBVATPATW PTGPELPVSP VTDLQATELP GQRVRVSWSP VPGATQYRII 540

VRSTQGVERT LVLPGSQTAF DLDDVQAGLS YTVRVSARVG PRBGSA8VLT VRREPETPLA 600 

VPGLRVWSD ATRVRVAWGP VPGASGFRIS H5TG86PE8S QTLPPDSTAT DITGLQPGTT 660 

YQVAVSVLRG REEGPAAVIV ARTDPLGPVR TVHVTQASSS SVTITHTRVP GATOyRVSWH 720 

SAHGPEKSQIi VSGEATVAEL DGLEPDTBYT VHVRAHVA67 DGPPASVWR TAPEPVGRVS 780

RLQIIiNASSD VLRITWVGVT GATAYRIxAWG RSEGGPMRHQ ILPGNTDSAE IRGIiEGGVSY 840

SVRVTALVGD RBGTPVSIW TTPPEAPPAL GTIiHWQRaB HSUILRWBPV PRAQ6FLLHW 900 

QPEiGQQEQSR VLGPELSSYH LDGLEPATQY RVRLSVLGPA 6EX3PSAEVTA RTESPRVP8I 960 

ELRWDTSID SVTLAHTPV8 RA8SYILSNR PIAOPGQBVP GSPQTLPGZS S8QRVTGLEP 1020 

GVSYIPSLTP VUJGVRQPBA SVTQTPVCPR GLADWPLPH ATQDHAHRAE ATRRVLERLV 1080 

LALGPLGPQA VQVGLLSYSH RPSPLFPLNG SHDLGIILQR IRDMPYMDPS GNNLGTAWT 1140

AHRYMLAFDA PGRRQHVPGV NVLLVDEPLR GDIPGPZREA QASGLNWML GMAGADPEQL 1200 

RSIlAPG^a^SV QTFFAVDDGP SLDQAVSGLA TALCQASFTT QPRPEPCPVY CPKGQKG^ 1260 

BMBLRGQVGP PGDPGZiPGRT GAPGPQGPPG SAtAKGERGF PGADGRP6SP GRAOHPOTPG 1320 

AFGLRGSFGL WGPRGDVGER GPRGPKt^IPG APGOVIGGEG PGLFGRKGDP GPSGPP6PRG 1380 

PLGDPGPRGP PGLPGTAMKG DKGDRGKRGP PGPGBGGIAP GEPGLPGLPG SPGPQGFVQP 1440

PGKK6EKGDS EDGAPGLPGQ PG8PGEQGPR GPPGAIGPKG DRGFPGFLGE AOBKGERGPP 1500 

GPAGSRGLFG VAGRPGAKGP EGPPGFTGRQ GEKGEPGRPG DPAWGPAVA GPKGEK6DV0 1560 

PAGPRGATGV QGERGPPGLV LPGDPQPXCB) PGORGPXGLT ORAGPPGDSG PPGBKraPOR 1620 

P6PP6PVGPR GRDGEVGBKO DEGPP6DPGL PGKAGERGLR GAP6VR0PV6 ERGDQGDFGS 1680

DGRNOSPGSS GPKGDRGSPG PFGPPGRLVD TGPGARBKGE PGDRGQEGPR GPKGDPGLPG 1740

APGERGIBGF RGPPGPQGDP GVRGPAGEKG DRGPPGLDGR SGLDGKPGAA GPSGFMGAAG 1800 

KAGDPGRDGL P6LRGSQGLP GFSGPPGLPG KPGEDGKPGL KGKNGEPGDP GEDGRK6EKG 1860 

DSGASSIG6R DGPKOERGAP GILGPQGPFG LPGFVGPPGQ GFPGVPGOTO PKODRGETOS 1920 

KGBQGLFGBR 6LRGEP65VP NVDRLLBTAG IKASMmRBIV ETWDBSSGSF LPVPERRRGP 1980 

KGDSGEQGPP GKEGPIGFPG BRGLKGDRGD PGPQGPPGLA LGBRGPPGPS GLAGEPGKPG 2040

IFGLPGRA6G VGEAGRPGER GERGEKGERG EQGRDGPPGIi PGTPGPPGPP GPECVSVDEPG 2100 

PGIiSGEQGFP GLKGAK6EPG SNGDQGPKGD RGVPGIKGOR QEPGPRGQDG NPGIiPGERGM 2160 

AGPEGKPGLQ OTRGPPGPVG GHGOPGPPOA FGLAOPAGPQ 6PSGLKGBPG ETGPPGRGLT 2220

GPTGAVGLP6 PPGPSGLVGP Q6SP6LPGQV GBTGKP6APG SD6AS6RDGD RQSP8VPGSP 2280

GLPGPVGPKG EPGPTGAPGQ AWGLPGAX6 EK8APGGLA0 DLVGEPGAKG DRGLPGFRGE 2340

KGBAGRAOEP QDPGEOGQKG APGPKGFKGD PGVGVPGSPG PPGPPGVKGD LGLPCHjPGAP 2400

GWGPP6QTG PRGEMGQPGP SQERGIiAGPP GREQIPGPLG PPGPPGSVGP PGASGLKGDK 2460 

GDPGVGIrPOP RGERGBFOIR OBDGRPGQBG PRGLTGPPGS RGBRGBKGDV GSABUOaiKG 2520 

DSAVILGPPG PRGAKODKGE RGPRGLDGDK GPRGDNGDPG DKGSRQEPGD KGSAGLFGLR 2580

GLLGPQGQPG AAGIPGDPGS PGKDGVPGIR GEKGDVGPMG PRGLKGERGV KGACGLDGEK 2640

GDXGEAGPPG RPGLAGHKOE MGEPGVPGQS GAPGKEGLIG PKCTRGFDGQ PGPKGDQGEK 2700 

GERGTPGIGG FPGPSGNDGS AGPPGPPGSV GPRGPEGLQG QKGERGPPGE RWGAPGVPG 2760

APQERGEQGR PGPAOPRGEK GBAALTEDDZ RGFVRQEMSQ HCACQGQFIA 8GSRPXiPSYA 2820 

ADTAGSQLHA VPVLRV8BAE EBERVPPEDD EY8BY8BYSV EEYQDPBAPW DSDDPCSLPL 2880 

DBQSCTAYTL RHYBRAVrGS TEACHPFVYG OOSGNANRFG TREACBRRCP FRWQ8QGTG 2940

TAQD 
Seq ID NO: 64 DNA sequence
Nucleic Acid Accession #: NM_006945
Coding sequence: 1-219 

1 11 21 31 41 51 

ATCTCTTATC AACA0CAGC3V OTGCAAGCAG CCXTTGCCAGC CIVCCTCCTGT QTGCCCCACO 60 

CCAAA6TGCC CAGAGCCATQ TCCACCCCCG AAGXGCCCTG AQCCCTGOOC ACCAOCRAAG 120 

TGTCCAiCaGC OCTSCCCACC TCAGCACSTGC CAGCRGAAAT ATCCTCCTGT GACAOCTTCC 180 
CXS^COCTOOC MCCMMBTA TCCACOGAAO AGCAAGTAA 

Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_008876

1 11 21 31 41 51 

MSYQQQQCKQ PCQPPPVCPT PKCPBPCPPP KCPEPCPPPK CPQPCPPQQC QQKYPPVTPS 60 
PPCOPKyPPK SK 
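
The "Coding sequence" coordinates given with each nucleic acid record can be cross-checked against the length of the paired protein record. The following is a minimal sketch in Python and is not part of the original listing: the function name `expected_protein_length` is hypothetical, and it assumes the coordinates are 1-based and inclusive and that the stated range ends with the stop codon.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Number of residues encoded by a 1-based, inclusive coding range.

    Assumption (not stated in the listing): the range includes the
    terminal stop codon, which encodes no residue.
    """
    n_nt = cds_end - cds_start + 1
    if n_nt <= 0 or n_nt % 3 != 0:
        raise ValueError("coding range must be a positive multiple of 3")
    return n_nt // 3 - 1  # drop the stop codon


# Seq ID NO: 64 lists "Coding sequence: 1-219".
print(expected_protein_length(1, 219))
```

Under these assumptions the range 1-219 corresponds to 72 residues, which agrees with the 60 + 12 residues listed for Seq ID NO: 65. A range whose length is not a multiple of three raises an error, which is one way to flag a coordinate that may have been damaged in transcription.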

Seq ID NO: 66 DNA sequence 

Nucleic Acid Accession #: NM_005629.1

Coding sequence: 639-2546

1 11 21 31 41 51 

TAGTCGOAGC GAGGTGGCGA GTCGCTGAGC CGGCCGCGGC CCCGAGAGCG GCTGCAGCCO 60 

COGOCGCCQO GAAGGAGAGO GCC5AGGCGCG CCCGAGCCOC CGCOGCCGCC GCCACCGCCG 120 

COGCCGCCAC CACCOCCACC GGAGTCGCG6 GCCAGCCGGG CAGCCTCGGC GQGCCCCGGC 180 

CGGGGCGGGG GGCGCGGGCC ACAGGCCCCT GCTCOGGCCG TQGTTTGCAG ACCGCGGGCG 240 

CCGATGTCGC CC6CGCCCCG TTAGGATGAG TCTCGGGTCG GGCGAGC»GC OOOCGCRGCC 300 

GCCGCCGCCC GAGCCGCGGG CAGGAGCCTC GGGAGCCGCC GCCGCC6CCG COGCCGCOCG 360 

GCCGGGCCCC GACGCCQCCC GCGCGCCCCC GGGCCCCGGA CACACATGAG ATTCTTCAGG 420 

CTCACTTTCA AGTGCTTOQT GGACTGCTTC TGACTGCGCC GCCCGCGCCC CGCACCCCGC 480 

OGTCCGCCCG COQCCCCGTC CCCCGGCCCG GCCGCCGCCC GGCCCCCGGC CGGCCCGGGC 540 

CCTCGGGGCC CTCCCCGOTG CCGCCGGTGC CCCCCGCCTG ACOGCCGCCC CCOGTOAGGC 600 

GCCGCGACCC CGGCCCGGCC GTGCGGCCCG CCGGGGCCAT GGCGAAGAAG AGOGCOGAGA 660 

AOGGCATCTA TAGCGTGTCC GGCGACGAGA AGAAGGGCCC CCTCATCGOG CCCGGGCCCG 720 

AOGGGGCCCC GGCCAAGGGC GAOGGCCCCG TGGGCCTGGG GACACCCGOC GGCCGCCTGG 780 

CCGTGCCGCC GCGCGAGACC TGGACGCGCC AGATGGACTT CATCATGTCO TGCGTQGGCT 840 

TC6CCGTGGG CTTGGGCAAC GTGTGQCGCT TCCCCTACCT OTGCTACAAG AAOGGCGGAG 900 

GTGIGTTCCT TATTCCCTAC GTCCTGATCG CCCtGQTTGO AGOAATCCCC ATTTTCTTCT 960 

TAGftGATCTC GCTGGGCCAG TTCATQAAGG CCQ6CAGCAT CAATGTCTQG AACATCTGTC 1020 

CCCTGTTCAA AGGCCTGGGC TAOGCCTCCA TGGTGATCGT CTT CTAC TGC AACACCTACT 1080 

ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGGTCAA GTCCTTTACC ACCACGCTGC 1140 

CCTGGGCCAC ATGTGGCCAC ACCTOQAACA CTOCOGACIG COTGGAQATC TTC0GCCAT6 1200 

AAGACTGTGC CAATGCCAGC CTOGCCAACC TCACCTGTOA CCAGCTTGCT GACC6CCGGT 1260 

CCCCTGTCAT CQAGTTCTGG GAGAACAAAG TCTTGAGGCT GTCTGGGGGA CTGGAGGTGC 1320 

CAGGGGCCCr CAACTGGOAQ CTGAOCCTTT GTCTGCTGGC CTGCTGGGTG CTGGTCTACT 1380 

TCTGTGTCTG GAAGGGGGTC AAATOCACGG GAAAGATCGT GTACTTCACT GCTACATTCC 1440 

CCTAOGTGGT CCTGGTCGTG CTGCTGBTGC GTCGftCTGCT GCTGCCT6GC GCCCTGGATG 1500 

GCATCATTTA CTATCTCAAG OCTGACTGGT CAAACCTGGG OTCOCCTCAO OTSTGOATAG 1560 

ATGCGGGGAC CCAGATTTTC TTTTCTTACG CCATTGGCCT GGGGGCCCTC ACAGCCCTGG 1620 

GCAGCTACAA CCGCTTCAAC AACAACTGCT ACAAG6AC6C CATCATCCTG GCTCTCATCA 1680 

ACAGTGGGAC CRGCTTCTTT GCTGGCTTCQ TGGTCTTCTC CATCCTGGGC TTCATGGCTG 1740 

CAGftGCAGGG CGTGCACATC TCCAAGGTGG CRGACTCAGO GCC3QGGCCTG 6CCTTCATCG 1800 

CCTACCCGCG GGCT6TCAC6 CTGATGCCAO TGGCCCCACT CTOGGCTGCC CTGTTCTTCT 1860 

TCATGCTCTT GCTGCTTGGT CTCQACAGCC -AGTrrGTAGG TGTGGAGGGC TTCATCACCG 1920 

GCCTCCTCGA CCTCCTCCCG GCCTCCTACT ACTTCCGTTT CCAAAGGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TGCCCTCTGC TTTGTCATCG ATCTCTCCAT GGTGACTGAT GGCGGGATX3T 2040 

AOOTCTTOCA GCTOrPTQAC TACTACTCQO OCAGOGGCAC CACCCTGCTC TGGCAGGCCT 2100 

TTTGGGAGTG CGTGGTGGTG QCCTGGGTGT AOGGAGCTGA COGCTTCATG GACGACATTG 2160 

CCTGTATQAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC TTCTTCACCC 2220 

CGCTGGTCTG CATGGGCATC TTCATCTTCA ACGTTGTGTA CTACGAGCCG CTQOTCTACA 2280 

ACAACACCTA CGTGTACCCG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTOTG CCTGCCGCTG CACCTCCTGG GCTGCCTCCT CAGGGCCAAG GGCACCATGG 2400 

CTGAGC6CTG GCAGCAOCTG ACCCAGCCCA TCTGGQGCCT CCACCACTTG GAGTACCGAG 2460 

CTCAGGACGC AGATOTCAGG GGCCTGACCA CCCTGACCCC AGTGTCOGAG AGCAGCAAGG 2520 

TCGTCGTGGT GGAGAGTGTC ATGTGACAAC TCAGCTCACA TCACCAGCTC ACCTCTGGTA 2580 

GCCATAGCAG COCCTGCTTC AGCCCCACCQ CACCCCTCCA GGG66CCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCCCCTCCA 6CCCTAGC0G AGCTGGTGCT AGGCCCCGCC TAGTGCCCCA CCCCC ACCCA 2820 

CA6TCCTGCA CTCCTCCTGC CCCTGCCACG COCAOCCCCX GCCCACCTCT CCAGGCTCT6 2880

CTCTGCAGCA CACCCGTGGG TGACCCCTCA CCCCR6AAGC AGCAGTGGCA GCTTGGGAAA 2940 

TGTGAGGAAG GGAAGGAGGG AGAGACGGGA GGGAGGAGAG AGAGGAGAAG 6GAGGCRGGG 3000 

GAGGGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC CCCATCCCTG 3060 

TTATAGAAGC TTAGAGAGCC AGCCAGCAAT G6AACCTTCT GGTTCCTGOQ CCAATGGCCA 3120 

CCAGTATCAA TT6TGTGAGC TTGG6TG06A GTGCAC3QCGT GOGTGftGTAC G6AGA0TATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAATOGOSCC TCTGQGCRAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGATTTC T6CTTGTATA 3300 

TTTCTAAAAA GAG6AAGGA6 CCCAAACCAT CCTCTCCTTA CCACTCCCAT CCCTGTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGQ AGTGTGAATT TATAGATCTA ACTTTCATAG 3420 

GCAAAACAAA AGCTTOSAGC TOTTGCOTOT GTGAGTCTGT TGTGTGGATG TGCGTGTGTG 3480 

OTCCCCAQCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTOGGGG CTGTCCCCAC 3540 

GCTGTCCCTT TGCCACAAGT CTGTOGGGCA AGAGGCTGCA ATATTCCGTC CTGGGTGTCT 3600 

GGGCTGCTAA CCTGGCCTGC TCAGGCTTCC CACCCTCTGC GGGGCACACC CCCAGGAAGG 3660 

GACCCTGGAC ACGOCTCCCA CGTCCAGGCT TAAGGTGGAT GCACTTCCCG CACCTCCAGT 3720 
CTTCTGTGTA GCAGCTTTAA CCCACGTTTG
TC3ACCGCAA6 AAAG6CTTCX: CCBACACCCk 
G(3GT6G0G66 CCTGGGGG6A. CATTCTACTG 
AAAACATGTC ATTTTCC 

Seq ID NO: 67 Protein sequence:
Protein Accession #: NP_005620.1 

1 11 21 31 41 51 

| | | | | |

MAKKSAEMGI YSVSGDEKKG PLIAPGPDGA PAKGDGPVGL GTPGGRLAVP PRBTKTRQMD 60 

PIMSCVGFAV GLGNVHRPPY LCVKNGGGVP LIPYVLIALV GGIPIFFIiEI SLGQFMKAGS 120 

INVWNICPLP K6LGYASMVI VFYCaJTYYIM VIAWGFYYLV KSPTTTLPWA TCGHTWNTPD 180

CVEIFRHEDC ANASLAKI*TC DQLAORRSFV IBFHENKVLR LSOGLEVPGA IiNHEVTLCLL 240 

ACWVLVYPCV WKGVKSTGKI VYFTATPPYV VLWLIiVRGV LIiP<3AIiDGZZ YYUCFDMSKL 300 

GSPQVWIDAG TQIFF8YAIG LGALTALGSY NRFMNNCVKD AIILAItlNSO T8PFAGFWP 360 

SILGFMAAEQ GVHISKVAES GPGLAFIAYP SAVTLMPVAP LHAALFFFML LLLGLOSQPV 420 

GVEGFITGLL DLLPASYYFR FQRBISVALC CALCPVIDLS MVTDGGMWP QhFDYYSASG 480 

TTIiIiWQAFWB CWVAHVYGA DRFMDDIACM I6YRPCFHMK tfCWSFFTPLV CMGZFIFNW 540 

yyEPIiVYNNT yVYPNWGEAM GHAFAXiSSML CVPLBLLGCL LRARGTKABR WQELTQPIWG 600 
UOXLEYRAQO ADVR6DTTLT PVSBSSKWV VESVM 
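
Each full row in these records holds six blocks of ten residues followed by a running count of the residues listed so far. That layout lends itself to a simple consistency check, sketched below in Python. This is an illustrative assumption rather than anything in the original listing: the helper names `parse_row` and `check_rows` are hypothetical, and a count that has been garbled into letters would be treated as sequence, not as a number.

```python
from typing import List, Optional, Tuple


def parse_row(row: str) -> Tuple[str, Optional[int]]:
    """Split one listing row into (sequence, running_count).

    The count is None when the row carries no trailing number,
    as on the last row of a record.
    """
    tokens = row.split()
    count = None
    if tokens and tokens[-1].isdigit():
        count = int(tokens.pop())
    return "".join(tokens), count


def check_rows(rows: List[str]) -> List[int]:
    """Return 1-based indices of rows whose printed count disagrees
    with the number of residues accumulated up to that row."""
    bad, total = [], 0
    for i, row in enumerate(rows, start=1):
        seq, count = parse_row(row)
        total += len(seq)
        if count is not None and count != total:
            bad.append(i)
            total = count  # resynchronise on the printed count
    return bad


rows = [
    "MSYQQQQCKQ PCQPPPVCPT PKCPBPCPPP KCPEPCPPPK CPQPCPPQQC QQKYPPVTPS 60",
    "PPCOPKyPPK SK",
]
print(check_rows(rows))
```

Applied to the two rows of Seq ID NO: 65, the check reports no mismatches. A row with a dropped or inserted character would be reported by its index, which helps locate transcription damage without asserting what the correct residue is.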



Seq ID NO: 68 DNA sequence
Nucleic Acid Accession #: NM_021953.1
Coding sequence: 178-2469 

1 11 21 31 41 51 

| | | | | |

GGCACQAGGO G6A0CCGGGC GGTCCQGCQC QAGCCCCGGT COOGOGOCCT GQCTGG6GGC 60 

CCAGGTTGGA G6A6CCC6GA GCCOGCCTTC GGAGCTAGGG GCTAAGG600 60QGCX3ACT6 120 

CAGTCTGGAG GGTCCACACT TGTQATTCTC AATGGAQAGT 6AAAA06CAG ATTCATAATG 180 

AAAGCTAGCC CCCGTCGGCC ACTQATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA QTGAAACATC AGAOGAGGAA CCTAAGAGAT GCCCTGCCCA ACAGGAGTCT 300 

AATCAA6CA0 AGGCCICCAA GQAAGTGGCQ GAGTCCAACT CTTGCAAGTT TCGAGCTG66 360 

ATCAAGATTA TTAACCACOC CACCATGCCC AACA06CAAG TAGTGGCCAT CCOCAACAAT 420 

GCTTAATATTC ACA6CATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGRG TGGCAGTAGT 480 

GOGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

GG6CCTCAAA CCCAAACCA6 CTAT6AT6CC AAAAGGACAG AAGTGACCCT G GAGAO CTTG 600 

GOACCAAAAC CTGCAGCTAG GGAT6TGAAT CTTOCTAOAC CACCTGOAGC CCTTTGCGAO 660 

CAGAAAGGGG A6ACCTGTGC AGATG6T6AG GCA6CA6GCT 6CACTATCAA CAATA8CCTA 720 

TCCAACATCC AGTGCCTTOS AAAGAT6AGT TCTGATGOAC TGGOCTCCCG CAGCATCAA6 780 

CAACAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCSVGC GACAOGTTAA GGTTGAGGAG 840 

CCTTOGAGAC CATCAGCGTC CTG6CAGAAC TCTGTGTCTO AGGGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTGGC CSVTCAACA8C ACrGAGAGGA AOaSCATQAG TTTGAAAOAC 960 

ATCTATAOGT GQATTQAGOA CXACTTTCCC TACTTTAAOC ACATTGCCAA GCCAGGCTG6 1020 

AAGAACTCCA TCCGCCACAA CCmCCCTG CACGACATGT TTGTCOGGGA GACGTCTGCC 1080 

AATGGC3U«3G TCTCCTTCTG GACCATTCAC CCCAGTGCC3V ACOGCTACTT GACATTGGAC 1140 

CAG6TGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC COGAGCACTT GGAATCACAG 1200 

CA6AAACGAC GGAATCCAGA GCT0C6CCQ6 AACATGACCA TCAAAACGGA ACTGCCCCTG 1260 

GGOGCAOSGC GGAAGATGAA GCXaCTGCTA CCACGGGTCA QCTCATACCT GOTACCTATC 1320 

C3VGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTOGQ TQAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCXaTAGCA AGOGAGTCOG CATTQCCCCC 1440 

AAGGTGCTGC TAGCTGAGQA GGGGATA6CT CCTCTTTCTT CTGCAGGACC AQGGAAAGAG 1500 

GASAAACTCC TQmGGAGA AGaGTTTTCT CCTTTQCTTC CAGTTCAGAC TATCAAGGAG 1560 

GAAGAAATOC AQCCTGGGGA GGAAATGCCA CACTTAGCOA GACCCATCAA AGTGGAGAGC 1620 

CCTCCCTTGG AAGAGTOGCC CTCCXXX3GCC CCATCTTTCA AAQAGQAATC ATCTCACTCC 1680 

TGGGAG6ATT OGTCCCAATC TCCCACCCCA AGACCCARGA AGTCCTACAG TGGGCTTAGG 1740 

TCCCCAACCC GGTOTGTCTC GGAAATGCTT GTGATTCAAC ACAGGGAGAG GAGG6AGAGG 1800

AGCCQGTCTC GQAGGAAACA GCATCTACTG CCTCCCTGTG TQGATGAGCC GGAGCTGCTC 1860 

TTCrCAGAGQ QGCCCAGTAC TTCCCGCTGG GCCX3CAGAGC TCCOGTTCXX: AGCAGACTCC 1920 

TCTGACCCTG CCTCCCAGCT CAGCTACTCC CAGGAAGTG6 GAG6ACCTTT TAAGACaWTCC 1980 

ATTAAOGAAA CGCTGGCCAT CTCCTCCACC CGGAGCAAAT CTGTCCTCGC GAQAACCCCT 2040 

GAATCCTGGA GGCTCACGCC CCCAGCCAAA GTAGGGGGAC TGGATTTCAO CCCAGTACAA 2100 

ACCTCCCAGG GTGCCTCTGA CCCCTTGCCT QhCCCCCTOQ GQCTGATGGA TCTCAGCACC 2160 

ACTCCCTTGC AAAGT6CTCC CCCCCTTGAA TCACCQCAAA GGCTCCTCAG TTCAGAACCC 2220 

TTAGACCTCA TCTCXX3TCCX: CTTTGGCAAC TCTTCTCCCT CAGATATAGA OGTCCGCAAG 2280 

CCAGGCTCOC OSGAGOCACA G G TTT C IGGC CTTGCAGCCA ATOSTTCTCT 6AGA6AAGGC 2340 

CTGGTCCTGO ACACAATGAA TGACAQCCTC AGCAAOATGC TQCT6QACAT CAGCTTTCCT 2400 

GGCCTGGAOG AGGACCCACT GGGCCCTGAC AACATCAACT GGTCCCAGTT TATTCCTGAG 2460 

CTACAGTAGA GCCCT6CCCT TGCCCCTGTQ CTCAAGCTGT CCACCATCCX: GGGCACTCCA 2520 

AGGCTCAGTG CAOCCCAAGC CTCTGAGTGA 66ACAGCAGG CAGGGACTGT TCTGCTCCTC 2580 

ATAGCTCCCT GCTGOCT GA T TATGCAAAAG TAGCAGTCAC ACCX:TAGCCA CTGCTGGGAC 2640 

CTT6TGTTCC OCAAGAGTAT CTOATTOCTC TGCTGTCOCT GCCAGGAGCT QAAGGGTGGG 2700 

AACAACAAAQ GCAATGGTGA AAAGAGATTA GGAACCCOCX: AGCCTGTTTC CATTCTCTGC 2760 

CCAGCAGTCT CTTACCTTCC CTGATCTTTG CAGGGTGGTC CGTGTAAATA GTATAAATTC 2820 

TCCAAATTAT CCTCTAATTA TAAATGTAAG CTTATTTCCT TAGATCATTA TCCA6AGACT 2880 

GCCAGAAGGT GGGTAGGAT6 ACCTGGGGTT TCAATTGACT TCTGTTOCTT OCTTTTAOtT 2940 

TTGATAGAAG GGAAGACCT6 CAGTGCACGG TTTCTTCCAG GCTGAGGTAC CTGGATCTTQ 3000 

GGTTCTTCAC TGCAGGGACC CAQACAAQTG GATCTGCTTG CCAGAGTCCT TTTTGCCCCT 3060 

CCCTGCCACC TCCCCGTGTT TCCAAGTCAG CTTTCCTGCA AGAASAAATC CTGGTTAAAA 3120 

AAGTCTTTTG TATrGGGTCA GGAGTT6AAT TTGGGGTGGG AGGATGGATG CAACTGAAGC 3180 

AGA8TGTGGQ 1GCCCAGATG TGOGCTATTA GATGTTTCTC TQATAATQTC CCCAATCATA 3240 

CCAGGGAQAC TGGCATTGAC GAQAACTCAO QIGOAGGCTT GAGAAGGCOG AAAGG6CCCC 3300 

TGACCTGCCT GGCTTCCTTA GCTTGCCCCT CAGCTTTGCA AA6AGCCACC CTAGGCCCCA 3360 

GCTGACCGCA TGGGTGTGAG CCAGCTTQAG AACACTAACT ACTCAATAAA AGCX3AA6GTG 3420 
GACaiAAAAA AAAAAAAAAA AAAA 



TCTGTCAOGT CCAGTCXX3GA GACX3GCTGAG 3780 
GACAGAGGCT GCAGGGCTOO GGCTSaGTOA 3840 
TGCTAAAAAO CXACTGCAOA CATAOCAATA 3900 
Seq ID NO: 69 Protein sequence:
Protein Accession #: NP_068772.1

1 11 21 31 41 51

| | | | | |

MKASPRRPIiI LRRRRLPLPV QNAPSSTSEB EPKRSPAQQB SNQABASKEV AESNSCXFPA 60

6IKIINHPTM PNTQWAIPN NANZHSIITA LTAR6KE86S S6PZIKFILIS 0GGAPTQPP6 120

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRFPGALC EQKRBTCAD6 EAAGCTINNS 180

LSNIQWLRKM SSDGLGSRSl KQEMEEKENC HLEQRQVKVE EPSRPSASiiQ NSVSERPPYS 240

YKAMIQFAIN STERKRMTLK DIYTWIEDHP PYFKHIAKP6 WKNSIRBNLS IiHDMFVRETS 300

ANGKVSFNTI HPSANRYLTL DQVFKPLDPO SPQLPEHLSS QQKRPNPEXiR RNKTIXTEIiP 360

LQARRKMKPL LPRVSSYLVP ZQFPVMQSLV LQPSVKVPLP LAASLMSSEL ASHSKRVRIA 420

PKVLLAEBGZ APLSSAGP6K EEKLLFGEGF SPLLPVQTIK EBEIQPGEEM PHLARPIKVE 480

SPPLBEWPSP APSFKEESSH SWED5SQSPT PRPKKSYSGL RSPTRCVSEM IjVIQHRERRE 540

RSRSRRKQHIi LPPCVDBPEL IiPSEGPSTSR WAAELPFPAD SSDPASQLSY SQEVGGPFKT 600

PIKETIiPISS TPSKSVLPRT PSSMRLTPPA KVGGLDFSPV QTSQGASDPL FDPLGIiMDLS 660

TTPLQSAPPL ESPQRLLSSS PIDLISVPFG N8SPSDIDVP KPGSPBPQVS GLAANRSLTE 720

GLVWnWDS LSKZIiLDISF PGLDEDPL6P ONZNWSQPZP ELQ

Seq ID NO: 70 DNA sequence

Nucleic Acid Accession #: BC006529.1 

Coding sequence: 178-2424

1 11 21 31 41 51 

| | | | | |

GGCACGAGGG GGACCOGGCC GGTCCX3GCGC GAGCCCCC3GT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCOGCCTTC GGAGCXACGG CCTAAOSGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGA6AGT GAAAACQCA6 ATTCATAATG 180 

AAAACTAGCC CCCGTCX3GCC ACTGATTCTC AAAAGAC6GA GGCTGCCCXTT TCCTGTTCAA 240 

AATGCCCCAA GTQAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGQCCTCXZAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTr TCCAOCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420 

6CTAATATTC ACAGCATCAT CACASCACTG ACTGCCAA60 GAAAAGA6A6 TG6CAGTAGT 480 

6C36CCCAACA AATTCATCCT CATCA6CTGT 66GGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GOACCAAAAC CTGCAGCTAG QQATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CA6AAACGGG AQACCTGTGC AGATGGTGAG GCAGCAG6CT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC A6T66CTTG6 AAAGATGAGT TCTGATG6AC TGOGCTCC06 CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGA6SAG 840 

CCTTOSAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCftACAGC ACTGAGAGGA AGCGCATGAC TTTQAAAGAC 960 

ATCTATAC6T GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA 6CCAG6CTGG 1020 

AAGAACTCXA TCCGCCACAA CCTTTCXCTG CACGACATOT TTGTCGGGQA GAGQTCIGCX: 1080 

AATQGCAAGG TCTCCTTC1X3 GACX:ATTCAC CGCAGTGCX31 AC06CTACTT GACATT6GAC 1140 

CAGGTGTTTA AGCAGCAGAA ACGACCXSAAT CCA6AGCTCC GCCGGAACAT GAOCATCAAA 1200 

ACC6AACTCC CCCTGGGCGC ACQGCGGAAG ATGAAGCCAC TGCTACCACG GGTCRGCTCA 1260 

TACCTG6TAC CTATCCAGTT CCOSGTQAAC CAGTCACTtSG TGTTGCAGCC CTCGGTGAAG 1320 

GTGCCATTQC CCCT6G0QQC TTCCCXOVIO AlGCTCAQAGC TTGCXICGCCA TAGCAA6C6A 1380 
OTCOGCATTG CCCCCAAGQT GCTGCTAGCT GAGGAGQGGA TAGCTCCTCT TTCTTCTQCA 1440

GGACCAGGGA AAGAGGA3AA ACTCCTGTTT GGAGAAGGGT TTTCTCCTTT GCTTCCAGTT 1500 

CAQACTATCA AGGAGGAAGA AATCCAGCCT GGGGAGGAAA TGCCACACTT AGOGAGACCC 1560 

ATCAAA6TGG AGA6CCCTCC CTTGGAAGAG TGGCCCTCCC CGGCGCCATC TTTCAAA6AG 1620 

6AATCATCTC ACTCCTGGQA GOATTOGTCC CAATCTCCCA CGCGAAOACC CAAGAAOTCC 1680 

TACAOrGGGC TTAGGTCCCC AACC0G6TGT 6TCTCG6AAA TGCrTOTGAT TCAACACAGG 1740 

GAGAGGAGGG AGAGGAGCCX3 GTCTCGGAGG AAACAGCATC TACTGCXTTCC CTGTGTGGAT 1800 

GAGCCX3GAGC TGCTCTTCTC AGAGGGGCCC AGTACTTCCC GCTGGGCCGC AGAGCTCCCG 1860 

TTCCCAGCAG ACTCCTCTGA CCCTGCCTCC CAGCTCAGCT ACTCCCAG6A AGTGGGAGGA 1920 

CCTTTTAAGA CACCCATTAA GGAAACGCTG CCCATCTCCT CCACCCCXSAG CAAATCT6TC 1980 

CTCCCCAGAA CCCCTGAATC CTGGAG(3CTC AGGCCCXTCAG CCAAAGTAGG 666ACTGGAT 2040 

TTCAGCCCA6 TACAAACCCC CCAGGGTGCC TCTGACCCCT TGCCTGACXX: CCTGGGGCTQ 2100 

ATGGATCTCA GCACCACTCC CTTGCAAAGT GCTCCXrCCCC TTGAATCACC GCAAAG6CTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCGCTTTG GCAACTCTTC TCCCTCAGAT 2220 

ATAGAOGTCC CCAAGCCAGG CTCCCCGGAG CCACAGGTTT CTGGCCTTGC AGCCAATCGT 2280

TCTCTGACAG AAGGCCTGGT CCTGGACACA AT6AATGACA GCCTCAGCAA GATCCTGCTG 2340 

GACATCAGCT TTCCTGGCCT GGACiGAGGAC CCACT6GGCC CTGACAACAT CAACTGGTCC 2400 

GASTTTATTC CT6AGCTACA 6TAGAGOXT GCCCTTGCCC CTQTGCTCSUl GCTOTOCACC 2460 

ATCCCG6GCA CTCCAAGGCT CAGT6CACCC CAAGCCTCTG AGTGAGGACA GCAGGCAGGG 2520 

ACTGTTCTGC TCCTCATAGC TCCCTGCTGC CTGATTATGC AAAAGTAGCA GTCACACCCT 2580 

AGCCACTGCT GGGACCTTGT GTTCCCC3U^G AGTATCTGAT TCCTCTGCTG TCCCTGCCAG 2640 

GAGCTGAAGG GTGGGAACAA CAAAG6CAAT GGTGAAAAGA 6ATTAGGAAC CCCCCAGCCT 2700 

GTTTCCATTC TCTGGCCAGC AGTCTCTTAC CTTCCCIOAT CTTTGCAGGG T6GTCCGTGT 2760 

AAATAGTATA AATTCTCCAA ATTATCCTCT AATTATAAAT GTAAGCTTAT TTCCTTAGAT 2820 

CATTATCCAG AQACTGCCAG AAQGTGGGTA GGATGACCTG GGGTTTCAAT TGACTTCTGT 2880 

TCCTTQCTTT TAGTTTTGAT AGAAGGGAAG ACCTGCAGTG CAC53GTTTCT TCCAGGCTGA 2940 

GGT ACCTGGA TCTTGGGTTC TTCACTGCAG GG ACCC AGAC AAGTGGATCT GCTTGCXaGA 3000 

GTCCTTTTTG GCCCTCCCTG CGACCTGOCC GTGTTTCCAA GTCAQCTTTC CrGGAAOAAG 3060 

AAATCCTGGT TAAAAAACTC TTTTGTATTG G6TCAGGAGT TGAATTTGG6 GTGGGAGQAT 3120 

GGATGCAACT GAAGCAGAGT GTGGGTGCCC AGATGTGCGC TATTAGATGT TTCTCTGATA 3180 

ATGTCCCCAA TCATACCAGG GAGACTGGCA TTGACGAGAA CTCAGGTGGA GGCTTGAGAA 3240 

GGCGGAAAGG GCCCCTGACC TGCCTG6CTT CCTTAGCTTG CCCCTCA6CT TTGCAAAGAG 3300 

OCACCCTAGG CCCCAGCIGA CG6GATGQ6T GTGAGGCSISC TTGA6AACAC TAACTACTCA 3360 
ATAAAA60QA AGGTGGAAAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 71 Protein sequence: 
Protein Accession #: AAH06529.1 
1 11 21 31 41 51

| | | | | |

MKTSPSRPLI LKRRRLPLPV QKAPSETSEE EPKRSPAQQE SKQAEASKEV AESNSCKFFA 60 

OIKIINHPTM PNTQWAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120 

LRPQTQTSYD AXRTBVTLET LOPKPAARDV NLPRPPGALC EQKRETCADO BAAGCTINMS 180

LSNIOWLRKH SSDQL6SRSI KQEHBEKEHC RLEQRQVKVB EPSRPSASWQ HSV8ERFPYS 240 

yMAMJQFAZK STBRKRMTLK DIYTWZGDHF PYFKHIAKPG HKNSIRHNLS LHDMFVRETS 300 

ANGKVSPWTI HPSAKRYLTL DQVPTCQQKRP NPELRHNMTI KTELPLGARR KMKPLLPRVS 360 

SYLVPIQFPV NQSLVLQPSV KVPLPLAASL MSSELARHSK RVRIAPKVLL AEEGIAPIjSS 420 

AGP6KEBKLL FGEQFSFLLF VQTIKEEEIQ P6EEMPHLAR PXKVESPPLE EWPSFAPSFK 480 

BESSKSffEDS SQSPTPRPKK SY5GLRSPTR CV5EMLVIQH RERRER5R5R RKQRLLPPCV 540 

DEPELLFSEG PSTSRNAABL PFPADSSDPA SQLSYSQEVG GPFKTPIKBT LPISS7PSKS 600 

VLPRTPBSWR LTPPAKVG6L DFSPVQTPQG ASDPLPDPIjO LMDLSTTPLQ SAPPLESFQR 660 

LLSSEPLDIiI SVPFGNSSPS DIDVPKPGSP BPQVSGLAAN RSLTEaLVU) TMNDSLSKIIi 720 
U3ISFPGLDE DPLGPDHINW SQFZPBLQ 



Seq ID NO: 72 DNA sequence
Nucleic Acid Accession #: U74612.1
Coding sequence: 178-2583

1 11 21 31 41 51 

| | | | | |

GGCACGAOGG GGACCOQGCC G6T00860SC QAGCCCCOGT 00C3GGGG0CT GOCTCBOOCC 60

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG QCGGOQACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATQGAOAGT GAAAACOCAG ATTCATAATG 180 

AAAACTAGCC CCCGTCGGCC ACTOATTCTC AAAAGAOGGA QQCTGCCCCT TCCTGTTCAA 240

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAQAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAO AfiGCCTCXAA GOAAQTGGCA OASTOCAACT CTTOCAAGTT TCCAGCTGG6 360 

ATCAAQATTA TTAACCACCC CACCAT6CCC AAGA06CAAG TA6TGGCCAT CCCCAACAAT 420 

OCTAATATTC ACAGCATCAT CACACCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGOCC CAACTCAGCC TCCAGGACTC 540 

CX3GCCTCAAA CCCAAACCAO CTAT6ATGGC AAAAGGACAG AAGTGACCCT G GAGAC CTTG 600 

GGACCAAAAG CTGCftOCTAS GGATCTGAAT CTTCCXAGAC CACCTGGA6C CCTTTGOGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAQCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCOG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTQAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTO AGOSGCCACC CTACTCTTAC 900 

AIG8CCATGA TACAATTGGC CATGAACASC ACIIGAOAOQA A80SGA3GAC TTTGAAAOAC 960 

ATCTATAOOT GQATTQAGGA CCACTTTCCC TACITTAAGC ACATTGCOiA GCCAGOCTGG 1020 

AAGAACTCCA TOCX3CCACAA CCTTTCCCTO caCCJACATOT TTGTCCGGGA GAOGTCTGCC 1080 

AATGQCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACOGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAiGGGTCT CCACAATTGC COGAGCACTT GGAATCACAG 1200 

CACaAAAGQAC GQAATCCAC2A GCTGCX3C0GG AACATGACCA TCAAAACOBA ACTCCCCCTG 1260 

GGOGCACGGC GQAAGATGAA GCCACTGCTA CCA06GGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTC3GG TGAAGGTGCC ATTGCCCCTG 1380 

GOSGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGOGAGTCCG CATTGCCCCC 1440 

AAGGTTTTTG GGGAACAG6T GGTGTTTGGT TACAT6AGTA AaTTCTTTAG TGGOSATCTG 1500 

OGAGATTTTG GTACACCCAT CACCAGCTTG TTTAATTTTA TC IT A ' C I T TG TTTATCAGTG 1560 

CTGCTAGCTG AGGAGGGGAT A6CTCCTCTT TCTTCTGCAG GACCAG6GAA AGAGOAGAAA 1620 

CTCCTGTTTG QAGAAGGGTT TTCTCCTTTG CTTCGAGTTC AGACTATCAA GGAGGAAGAA 1680 

ATCCAGCCTG GGGAGGAAAT GCCACACTTA GOSAGAOCCA TCAAAGTGGA GAGCOCTCCC 1740 

TT6GAAGAGT GGCCCTCCCC GGCOCCATCT TTCAAAGAG6 AATCATCTCA CTCCTGGGAG 1800

6ATTC9STCOC AATCTCCCAC CCCAAOACCC AAGAAGTCCT ACAGTGGGCr TAGGTCCCCA 1860 

ACCCGGTGTO TCTGGOAAAT 6CTTGTGATT CAACACAGG6 AGAOGAGGGA 6AGGAGCGGG 1920

TCTCGQAG6A AACAQCATCT ACTOCCTCCC TGTOTGGATG AGCOGGAGCT GCTCITCTCA 1980 

GAGGGGCCCA QTACTTCCCG CTGGGCOSCA QAGCTCCCX3T TCCCAGCAGA CTCCTCTGAC 2040 

CCTGCCTCCC AOCTCAGCTA CTCCCAGGAA GTGGGAGGAC CTTTTAA6AC ACCCATTAAG 2100 

OAAAGQCTGC CCATCTCCTC CACCCOGAGC AAATCTGTCC TCCCCAGAAC CCCTQAATCC 2160 

TOGAGGCTCA OGCCCCCAOC CAAAGTAGGG GGACTGGATT TCAGCCCAGT ACAAACCTCC 2220 

CAGGGTGCCT CTGACCCCTT GCCTGACCCC CTGGGGCTGA TGGATCTCAG CACCACTCCC 2280

TTGCAAAGTG CTCCCCCCCT TGAATCACOG CAAAGGCTCC TCAGTTCAGA AOCCTTAGAC 2340 

CTCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAQATA TAQAOGTCCC CAAGCCAGGC 2400 

TCXCCGGAGC CACAGGTTTC TOGCCTTQCA GCCAATCGTT CTCTGACAGA AGGOCTGGTC 2460 

CTGGACACAA TCAATQACAG CCTCAGCAAG ATOCTGCTGG ACATCAGCTT TCCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTGCC AGTTTATTCC TGAGCTACAG 2580 

TAGA6CCCTG CCCTTGCCCC TGTGCTCAAQ CTGTCCACCA TOCOSOGCftC TOCAAGGCTC 2640 

AGTGCACCCC AAGCCTCTGA GTQAGGACAG CAGGCAGG6A CTQTTCTGCT CCTCATAGCT 2700 

CCCTGCTGCC TGATTATGCA AAAGTAGCAO TCACACCCTA GCCACTGCTG GGACXnTGTG 2760 

TTCCCXauVGA GTATCTGATT CCTCTQCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820 

AAAGGCAATG GTGAAAAGAG ATTAGGAACC CCCCAGCCIG TTTCCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTGATC TTTGCAQGGT GGTC0ST6TA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCCAGA 3000 

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACrTCTGTT CCTTGCTTTT AGTTTTGATA 3060 

GAAGG6AAGA CCTGCAGTGC ACGGTTTCTT CCaGGCTGAG GTACCTGGAT CrTGGGTTCT 3120 

TCACTGCA6G 6AC0CAGACA AGT6GATCTG CTTGCCAGAG TCCTTTTTGC CCCTCCCTGC 3180 

CACCTCCCCO TGTTTCCAAG TCAGCriTCC TGCAAGAAGA AATOCTGGTT AAAAAAGTCT 3240 

TTT6TATTGG GTCAGGAGTT GAATTTGGGG TGGGAGGATG GATGCAACTG AAGCAGAGTG 3300 

TGGGTGCCCA GATGTGOGCT ATTAGATGTT TCTCTGATAA TGTCCCCAAT CATACCAGGG 3360 

AGACTGGCAT TGAOSAGAAC TCAGGTGGAG GCTTGAGAAG GCOQAAAGGG CCCCTGACCT 3420 

GCCTGGCTTC CTTAOCTTGC CCCTCAGCTT TGCAAAGAGC CACCCTA66C CCCA6CTGAC 3480 

CGCATGG8T0 TQAGCCAGCT TGAGAACACT AACTACTCAA TAAAAGOGAA GGTGGACAAA 3540 
AAAAAAAAAA AAAAA 

Seq ID NO: 73 Protein sequence: 
Protein Accession #: AAC51128.1
1 11 21 31 41 51

| | | | | |

MKTSPRRPLI LKRRRLPLPV QNAPSBTSEE EPKRSPACX3E SNQT^EASKBV ABSUSCKFPA 60

GIKIINHPTM PNTQWAIPN NANIHSIITA LTAKGKESGS SGPNKPILIS OGGAPTQPPG 120

LRPQTQTSTO AKRTEVTLET LGPKPAARDV NI.PRPPGALC EQKRBTCADG EAAGCTIMNS 180 

LSHXQWIiRKN SSDGI^SRSI KQEMBEKEtIC HLEQRQVKVE EPSRP5ASWQ NSVSERPPYS 240 

YMAMIQPAIN STERKRMTLK DIYTWIEDHP PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300 

ANGKVSPWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP IiAASUISSEL ARHSKRVRIA 420

PXVFGEQWF GYMSKPFSGO IiRDFGTPITS LFNFIFLCLS VLZAEEGIAP LSSAGPGKEE 480 

KUiFOEGFSP LLFVQTIKBB EIQPGBEMPa lARPXKVESF PLBBWPSPAF SFKEBSSHSW 540 

BDSSQSPTFR PKKSYSGLRS PTRCVSEMLV IQHRESRSRS RSRRKQHIiLP PCVDEPBLLF 600 

SEGPSTSRWA AELPFPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPE 660

SMRLTPPAKV GGLDFSPVQT SQGASDPLPD PLGLMDLSTT PLQSAPPLES PQRLLSSEPL 720 

DLISVPF6HS SPSDIOVPKP GSPEPQVS6L AANRSLTEGIi VUXTMNDSIiS KIIiIiDISFPG 780 
LDEDPL6PDN UnfSQPIPEL Q 

Seq ID NO: 74 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 111-416 
1 11 21 31 41 51

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCS^C 60 

TCATCCTTCT ACTOGTGAOG CTTCCCAGCT CTGGCTTTTT GAAAGCAAAQ ATGA6CAACA 120 

CTCAAGCTGA 6AGGTCCATA ATAGGCATGA 7CGACATGTT TCACAAATAC ACChGAOGTG 180 

ATGACAAGAT TGAGAAGCCA AGCCTGCTGA CX3ATGATGAA GGW3AACTTC OCCAACTTCC 240 

TTAOTGCCTG TGACAAAAAG G6CACAAATT ACCTCGCCGA TOTCTTTGAQ AAAAAGGACA 300 

AGAAT6AGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGOOA GACATAGCCA 360 

CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC GGGGGGCAGC CAGTGAGCXIA 420 
GCCCCACCAA T6GGCCTCCA GAGACGCCA6 GAACAATAAA ATGTCTTCTC CCACCAGA 



Seq ID NO: 75 Protein sequence:

Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MSNTQAERSI IGMIOMPHKY TRRBDKIEKP SLZiTNMKENF PNFLSACDKK GTNYIiADVFB 60
KRDKNEDKKI DFSEPLSLLG DZATDYHKQS BGAAPCSGGS Q

Seq ID NO: 76 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 111-416



1 11 21 31 41 51 

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 60 

TCATCCTTCT ACTOSTGACA CTTCCGRGTr CrGGCTTTTT GAAAGCAAA6 ATGAGCAACA 120 

CTCAAGCTGA GAOGTOCATA ATAGGCATQA T06ACATGTT TCACAAATAC ACC6GAC6T0 180 

ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAATTTCC 240 

TCAGTGCCTG TGACAAAAAG GGCATACATT ACCTCGCCAC TGTCTTTGAG AAAAAGGACA 300 

AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA OACATAGOCG 360 

CAGACTACCA CAAGCAGAGC CATGGAGCGG CGCCCTGTTC TGGGGGAAGC CAGTGATGCA 420 
GCCCCACCAA GGGGCCTCCA GA6ACCCCAG GAACAATAAG TGTCTCCTCC CACCA6A 

Seq ID NO: 77 Protein sequence: 
Protein Accession #: XP_045124.1

1 11 21 31 41 51 

| | | | | |

MSNTQAER8I IGHIOMFHKy TGSDGKIBKP SLLTMMRSNF PNFLSACDKR GimTLATVPE 60 
XKDKNEDKKI DFSEPLSLLG DIAADYHSQS BGAAPCSGGS Q 

Seq ID NO: 78 DNA sequence 
Nucleic Acid Accession #: Z73678.1
Coding sequence: 253-2433 

1 11 21 31 41 51 

| | | | | |

GGGGTQGTGC AGGGCAGGGG TGGTATATCC TGTCTGACGG AGGGCGGGCC TGGCCAGTGC 60 

CAGAGA6GGA CGAACCAGOG TGGAA6GGCC AGGAGCAGCT GCAGGGA6CC CTCAGGCGGA 120 

CCTOGCACTC TATGGCCGTA GG6A6C0GCT GAGAGCGAGA A6AGCA0GCT CCTGCCOGCC 180 

OGCTGCACCG CACCTCGCCT CGCCTCTCTQ CTCTCCTAGG CCCCGGCCGC GCGCCACCCG 240 

CCTCCCGCCA CCATGAACCA CTCGCCGCTC AAGACOGCCT TGGCGTACGA ATGCTTCCAG 300 

QACCAGGACA ACTCCACGTT GGCTTTGCCG TCGGAOCAAA AQATGAAAAC AOGCAOQTCT 360 

GGCAGGCAGC GCGTGCAGGA GCAGGTGATG ATGACOGTCA AGCG6CAGAA GTCCAAGTCT 420 

TCCCAGTCGT CCACCCTGAG CCACTCCAAT CGAGGTTCCA TGTATGATGG CTTGGCTGAC 480 

AATTACAACT ATGG6ACCAC CAGCAGGAGC AGCTACTACT CCAAGTTCCA GGCAGGGAAT 540 

GGCTCATGGG GATATCOGAT CTACAATGGA ACCCTCAAGC GGGAGCCTGA CAACAGGCGC 600 

TTCAGCTCCT ACAGQCAGAT GGAGAACTGG AGCCGGCACT ACCCCCGGGG CAGCTGTAAC 660 

ACCACCGQCG CAGGCAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGOGAG COGCAGTGAG 720 

CC06ACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG GCAGGCTGGG CAGCAAGGGC 780 

CAGAAGACCA CCCAGAAOCG CTACAOCTTT TACA6CACCT GCA6TGGTCA GAAG6CCATA 840 

AAGAAGTQGC CT6TQG6GCC GCGCTCTTV3T GCCTCCAAGC AGGACCCTGT GTATATCCCG 900 
CCCATCTCCT GCAACAAGGA CCTGTCCTTT GGCCACTCTA GGGCCAGCTC CAAGATCTGC 960

AOTOAOGAGA TOQAGTGCAG T6G6CTGA0C ATCCCCAAGO CTGT6CAGTA CCIGASCTCC 1020

CAGOATGAGH AGTHCXAGGC CATTGGG6CC TATTACATCC AGCATACCT6 CTTCCASGAT 1080

GAATCTGCCA AGCAACAGGT CTATCAGCTQ GGA6GCATCT GCAA6CTGGT GGACGTCCTC 1140 

CX5CAGCCCCA ACCAGAACXST CCAGCAGGCC GCGGCAGGGG CCCT6CGCAA CCTGGTGTTC 1200 

AGGA6CACCA CCAACAAGCT GGAGACCCGG A6GCAGAATG GGATCCGCGA GGCAGTCAGC 1260 

CIGCIGAGGA GAACGGG6AA CXSCCGAOATC CAGAAOCAaC TGACTGGGCT GCTCTGGAAC 1320 

CT6TCTTCCA CTGA0GA6CT GAAGGAGGAA CTCATTOCGG AOGCCCTGCC TGTTCTGGCC 1380 

GACCGCGTCA TCATTCCCTT CTCTOGCTGO TGCQATGGCA ATAGCAACAT GTCCCGGGAA 1440 

GTGGTGGACC CTGAGGTCTT CTTCAATGCC ACAGGCTGCT TGAGGAACCT GAGCTCqGCC 1500 

GATGCAQGCC GCCAGACCAT GQGTAACTAC TCAGGGCTCA TTGATTCCCT CATGGCCTAT 1560 

QTCCAGAACT GTGTAGOOaC CAgGGGCTGT 6AGGACAAGT CTGTGGAAAA CTGCATGTGT 1620 

GTTCTGCACA ACCTCTCCTA CCGCCTGOAC GCCX5AQGTGC CCACCCGCTA COGCCAGCTG 1680 

QAGTATAACG CCC3GCAACGC CTACACCGAG AAGTCCTCCA CTGGCTGCTT CAGCAACAAG 1740 

AGCOACAAGA TGATGAACAA CAACTATCSAC TGCCCCCTGC CPGAGGAAGA GACCAACCCC 1800 

AAGGGCAGOG GCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCn-CATGOGC 1860 

AAGAGCAAGA AAGATGCTAC CCIGGAG6CC T6TGCTGGTG CCCT6CA6AA CCTGACAGCC 1920 

AGCAAG6GGC TGATGTCCAG TGGCATGA6C CAGTTQATTO GGCTGAAOGA AAAGGGCCTO 1980 

CCACAAATTG CCCX3CXTCCT GCAATCTGGC AACTCTGATG TGGTGCGGTC CQGAGCCTCX: 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATGGGGAA CCAGGTGTTC 2100 

CCGOAGGTGA CCAGGCTCCT CACCAGCCAC ACTGGCAATA CCAGCAACTC CGAAGACATC 2160 

TTGTCCTCGG CCTGCTACAC TGT6AGGAAC CTGATQGCCT CGCAGCCACA ACTGGCCAAG 2220 

CA6TACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCX3AAG CAGTGCCTCA 2280

CCCAAGGCCG CA6AAGCTGC COGGCTTCTC CTGTCTGACA TGTGGTCCAG CAAGGAACTG 2340 

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGGGAACCTT AGCTGGGGCC 2400 

AACAGCCTCA GGAACTTCAC CTCCCGATTC TAAGAAGAGA CTCTCCAAGC A AGTTA GQCT 2460 

TGCAGGAAQA TATGACCCAG CTGAGAAGCC CTCAGGCCTC GCTGGATGGG GTTTTCTGTC 2520 

CATCCTGTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAACCTA AAAACTGTGG 2580 

ATAGTGGAAA QATTTTTAGA TTTTTTTTTT CCTTGGGGAA ACTGQCAGGC AATGGGGGTT 2640 

AGGQAGGTTG GGGCX3GGGGG GGCTTTCTTG AGTTAAAGG6 GCTTATATGT GATGTCAATA 2700

TTTCTTCCTC TGAGAAATGG TATATATATG TGTCTAATGT AAGTGTGTGC ATGCaTGTGC 2760 

GCGTGCATCT GTGTGTGTGT GAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820 

TAAGCTATTT TGTTGCAGCT CATAAGGTGG TGAAAAGQAC TCTCCTGTGT TTCTTACTCA 2880 

TAOGCAAGQA CRACATGTGC TTTTT6GTGA GCTQCTCATA ATTCCTGAAA TGTGTQ6TGC 2940 

CAGGGCAAGG G6GCCATCAC TCCAGTCAOG C0CTCIU3A00 AGTCCTGCAiQ GCTTCCTACC 3000 

AGTGGTCTCC AAGGGTGCAG GAGTAACTQG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060 

GCTTTCCACG AAGGGAGGTC TGOTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120 

ACCCCTCCAG CAGCQCCACA AGGACTGAGQ TTGGGTAGGT GTGAGGTTCC AGAGGACAGC 3180 

AGGACACTCT C3SCaTACTTT GCCAAATGAG GOCTGCTCAG AGGAGTAGGA GCPGAAAGAT 3240 

GGT6CCTTCC AOCCTCTTGG G CJ S TGTGCC CATCAGA6CA GGCTCAQCCT GCAAAOGGGC 3300 

TGCATTCAOA G6TCTTGTAA TCTACTTGTT GCA6GAGAAA GAAG6TAAAA AAT6ATTTTT 3360 

TTAAQAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGQCTGGTCT 3420 

TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCA TATACCTATT CCGGCTTCTA 3480 

GTGGGATGGA GTTGGGGTAT AGAAATTAAC CAGGAAGATO TTTCCACCAA GCCTGCTGTG 3540 

AGTGAATTOA GGGAGTGTTT GGGTCCCAGG A6ACTTQQAC OGGGGQAGTr TGGGTAGACT 3600 

AGGAAAGGAA AGTGCCATAT CA6GGTACCG GTAC0G6CAA GCTCACATCT CA6CCA66GG 3660 

CCATGCCCCA CTTCXXXTTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA CCCACAGGGG 3720 

ATGTGATAAA CAGGGCTATT AGGGGTATCA GCCACGTCX3A GCCCCCAGAC TCTGTGCACT 3780 

TCAGACCAGC AGCAQCAGGA GGGCTCC08A GGGCCTTATG A6AAAACCT6 TGTGGAC ATC 3840 

CCTTGGTGTA CACTAAGACA GAGGAGAGCC CAGCGCTCCC AAGCCTTCCT CCTTCCA6CT 3900 

TCTACCTCCA TGCTAGCATT GCTQOTGTTA GASAQGAATT AACTTCCP6G TCTGTQCCCT 3960 

TCTCTA6AAG AATATAAGAT GCTCCTCCTC CTCACCCCTT CTCAGCCTCC TCCCAAGTCT 4020 

TCCTCTTCTG CACCACCCCC GAGTCCAAAC CCACCTCTTG CCCCAGCATT CAGGCTGGAA 4080 

AACACTGATG TGGACTCAGT ATGACAACTG AGATGGGG6A AGCCAGACAT GTGAGGACGC 4140 

TGTCCTCC3aA GftGGTGTCCC OGGCTOTTAG CCAGCIGTGC T6T6GTGCTO TGGGTCTGTC 4200 

ATAOCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CTGGCTCACC TCTCCCTCTC 4260 

AGGGACXXAC GTGGGAGCCT GGATCCCT6G ACTGTCCTGG GCATAQQTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCAQAACCC AGAGGAATTC TTCTCCTAAA AAATACGTAT GGCATACCAA 4380 

TCTGTGCGG6 GCS^TGTCCT AAGCACTTAG ACTACATCAG GQAAGAACAC AGACCACATC 4440 

CCOSTOCTCA TGOQGCTTAT OTTTTCTGGA GGAAAGTGGA GACACAAGTC CrTGGCTTTA 4500 

G6GCTCCCCC GGCTG6GGGC TGT6CAGTCC GGTCA6GGCG GGAGGGGAAA TGCACCGCTG 4560 

CATGTGAACC TTACCAGCCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620 

GCATCCCCTC GCCTCATGTT CCTCCCACCT TCAAAGAATG AAGAGCCXfCA TGGGCCCAflC 4680 

CCCTGCCCTG GGAACCAGGC AGCCTTCCAG ACCTCAGGGG CTGAGGCAGA CTATTAGGGC 4740 

AfiGQCTGACT TrGGTOACAC TGCCCATTCC CTCTCAGGCC AGCTCAGGTC ACCCOGGCCT 4800 

CTGACCCAQG CCTGTCACTT TGAGAGGGGC AAAACTGAGA GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAAGGA 6CTT6CCAGG CTTCATGTAG CCGACACAOG TCTCAGGATT TTAAGTCCAC 4920 

ATTGGCCTCA CACTAGCCTA GGCCAATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980

T6AGGAA66A CACAGACTCT GCCCTGGGAT CTCCTGTGCT AGOGGCCAAT 6ACAAATCCA 5040 

GTCATTGGCC ACCAGCCACC TCTGCAGTGG GGACCACACT AGCAGCCCTG ACTCCACACT 5100 

CCTCCTGGGG ACCCAAGAGG CAGTGTTGCT GTCTOOGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCAAGA CTGCGGCTGG GGT6GGCAG6 GAAGGGAAGC GGGGGGCT6C 5220 

TGTGAGGQAT CTTGGAGCTT CCCTGTAQCC CACCTTOCCC TTGCTTCATG T WQTAGAGQ 5280 

AACCTTGTGC CGGCX3VGGCC CAGTTTCCTT GT6TQAXACA CTAAT6TATT TGCTTTTTTT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTrTCTTTGA AAAAAAAAA 

Seq ID NO: 79 Protein sequence:
Protein Accession #: CAA98022.1

1 11 21 31 41 51 

1 I 1)11 

MNHSPLKTAL AYECFQDQDN STLALPSDQK MKTGTSGRQR VQEQVMMTVR RQXSKSSQSS 60 

TLSKSNR6SM YDGLADnYHY (STTSRSSYYS KFQAOIGSHO YPiyNOTIiKS EPDNRRFSSY 120 

SQMBNHSRHY PRGSQ^TTGA GSDICFHQKI KA6RSBPDLY QDPRGTLRKG TI^SKGQKTT 180 

QNRYSFYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDLSPGHSR ASSKICSEDI 240 

ECSGLTIPKA VQYLSSQDEK YQAIGAYYIQ HTCFQDESAK QQVYQLGGIC KLVDIjLRSPN 300 

QNVQQAAAGA LRNLVFRSTT NKLETSfiOtlG IREAVSUiRR TGKABIQKQL TGLIiWNIiSST 360
DELKEELXAD ALFVIiADRVI IPFSGWCDGKf BNHSREWDP EVFFHATQCL RNLSSADA6R
QTMRNYSGLI DSLMAYVQNC VAASRCDDKS VENCMCVLHN LSYSLDABVP TRYRQLEYNA 
mUVYTEKSST 6CFSNKSDXM MNNNYDCPLP EEETNFK6S6 WLYKSDAIRT YUniMSKSKK 
DATLEACAGA LQNLTASKGL MSSGM5QLI0 LKEKGLPQIA RLLQSGNSDV VRSOASLLSN 
MSRHPIJiHRV MGNQVPPEVT RLLTSHTGNT SNSEDILSSA CYTVRNLMAS QPQLAKQYFS 
SSMLNNIim) CRSSASPKAA EAARLLLSOM WSSKELQGVL RQQGFDRNML GTIAGAHSLR 
KFTSRF 



Seq ID NO: 80 DNA sequence 

Nucleic Acid Accession #: NM_006516.1

Coding sequence; 160-1658 






TACSrrCGCGGG 
GTCAGA6T06 
CGCACGCCCG 
TGGAGCCCAG 
TTGGCTCCCT 
AGGAGTTCTA 
TCA0CA06CT 
TCTCTQTGSe 
TGCTGGCCTT 
TGCTGATCCT 
CCATGTATGT 
AGCTGGGCAT 
GCAACAAGGA 
GCATCGTGCT 
AGAACCGGGC 
TGCAGGAGAT 
A6CTGTTCCG 
CCCAGCAGCT 
C3GGGGGTGCA 
CTGTCGTGTC 
TOGCTGGCAT 
TACCCTGGAT 
TGGGTCCTGG 
CAGCTGCCAT 
GCTTCCAGTA 
TGGTTCTGTT 
ATGAGATCGC 
AGCTGTTCCA 
GCCTGCTCCC 
AACCTGACAO 
GCAGAAOAAT 
AAATCTATTC 
ATATCAGCCT 
GAGGGTGGAG 
CTGGACCTAT 
6A6GTGGCTA 
CATTAGGATT 
CCTGAGACCA 
GCCGGGTTCT 
6GGAGCCTGC 
TGCAAGATAT 
ATATCTGGAC 
TATAAATG6C 
TTTGGAT06G 
GACTCAG6AT 
TTTQATCCCT 
ATCACATATT 
AGGCTTGAAA 



11 
I 

TCCCCGAGTO 
CAQTGGGAGT 
TCX3CCACCGG 
CA6CAAGAAG 
GCAGTTTGGC 
CAACCAGACA 
CTGGTCCCTC 
CCTTTTC3QTT 
CGTGTCCGCC 
GGGCCGCTTC 
GGGTGAAGTG 
CGTCGTC6GC 
CCTGTGGCCC 
GCCCTTCTGC 
CAAGAGTGTO 
GAAGGAAGAG 
CTCOCCOGCC 
GTCTGGCATC 
GCAGCCTGTG 
GCTGTTTGTG 
GGOGGGTTGT 
GTCCTATCTG 
CCCCATCCCA 
TGCCGTTGCA 
TGTGGA6CAA 
CTTCATCTTC 
TTCCGGCTTC 
TCCCCTGGGG 
A6CAGCCCTA 
ATGTCAGQOO 
ATTCAGGACT 
AGACAA6CAA 
GAGTCTCCTG 
ACTAAGCCCT 
GTCXTTAAOGA 
TGGCCACCCG 
TGCCCCTTCC 
GTTGGGAGCA 
AGTCTCCTTT 
AAACTCACTO 
TTATATATAT 
AAGCCAACTT 
TGGTTTTTAG 
AGTQA6ACA0 
CCAlGTCCCTT 
GTTATCCAGA 
TGATAGTTGG 
TCGCATTATT 



21 

I 

AGCAGGCCAG 
CCCC6GACG6 
CX3TACCG6GC 
CTGACGGGTC 
TACAACACTG 
TGGGTCCACC 
TCAGIQQCCA 
AACC6CTTTG 
GTGCTCATGG 
ATCATCX3GTG 
TCACCCACAG 
ATCCTCATCG 
CTGCTGCTGA 
CCCGAGAGTC 
CTAAAGAAGC 
AGTCGGCA6A 
TACTGOCAGC 
AACGCTGTCT 
TATGCCACCA 
GTGGAGC6AG 
GCCATACTCA 
AGCATCGTGG 
TGGTTCATCG 
GGCTTCTCCA 
CTGTGTGGTC 
ACCTACTTCA 
CGGCASGGQG 
GCTQATTCCC 
AGGATCTCTC 
AGCGGOGCCT 
TAAOSGCTCC 
CAGGTTTTAT 
TGCCCACATC 
GTCGAQACAC 
CACACIAATC 
TTCTGCTGGC 
CATCrCTTCC 
CTGGAGTGCA 
GCACTGAGGG 
CTCAAOAAGA 
TTTTQGrrOT 
GTAAATACAC 
AAACAT6GTT 
AAGTAAOTGG 
ACACGTAGCT 
GAATATATAC 
TGTTCAAAAA 
TTQAATGTOA 



31 
I 

GGAGCAGGA6 
GAGCACGAGC 
GCAGCCAGAG 
GCCTCATGCT 
QAGTCATCAA 
GCTATGGGGA 
TCTTTTCTGT 
GCGGGCGGAA 
GCTTCTCGAA 
TGTACTGCGG 
CCTTTCXjTGG 
CCCAGGTGTT 
GCATCATCTT 
CCCGCTTCCT 
TGCGCGGGAC 
TGATGCGGGA 
CCATCCTCAT 
TCTATTACTC 
TTGGCTCOGG 
CAGGCXSGCG 
TGACCATOGC 
CCATCTTTGG 
TGGCTGAACT 
ACTGGACCTC 
CCTAOQTCTT 
AAGTTCCTGA 
GAGCCAGCCA 
AAQTGTGAGT 
AGGAGCACAG 
GGGGCTCCIT 
AGGATTTTAA 
AATTTTTTTA 
CCAGGCTTCA 



GAACTATGAA 
CTGGATCTCC 
TACCCAACCA 
GGGAG6AGAG 
CCACACTATT 
GATOGAGACT 
CAATATTAAA 
CACCTCACTC 
TTGAAATGCT 
QGTTGCAACC 
CTCATCAGTG 
ATTCTTTATC 
AACACTAOTT 
AGGGAA 



41 

I 

ACCAAAC6AC 
CTGAGCX3GGA 
CCACCAGC6C 
GGCTGTGGGA 
TGCX:CCCCAG 
GAGCATCCTG 
TGGGGGCATG 
TTCAATGCTG 
ACTGGGCAAG 
CCTGACCACA 
GGCCCTGGGC 
06GCCTGGAC 
CATCCCX3GCC 
GCTCATCAAC 
AGCTGACGTG 
GAAGAAGGTC 
CGCTGT6GTG 
CACGAGCATC 
TAT06TCAAC 
GACCCTGCAC 
GCTAGCACTG 
CTTTGTGGCC 
CTTCAGCCA6 

aaatttcatt 
catcatcttc 
gactaaaggc 
aagtgataag 
cgccccagat 
gcagctggat 
tctccaggca 

CAAAAGCAAG 
TTACTGATTT 
CCCTGAATGG 
CACCCAGCTA 
CTACAAAGCr 
CCACTCTAGG 
CTCAAATTAA 
GGGAAGGGCC 
ACCATGAGAA 
CCTGCOCIGI 
TACA6ACACT 
CTGTTACTTA 
TGTGGATTGA 
ACTGCAACGG 
TCCTCTTGCT 
TTGACATTCA 
TTGTGCX^AGC 



Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_006507.1 



1 

) 

MEPSSXKLTG 
LTTLWSLSVA 
MLILGRFIIG 
GKKDLWPLLL 
LQEMKEESRQ 
AGVQQPVYAT 

CPQYVEQLCX3 
BLFHPLGADS 



11 

I 

RLMLAVGGAV 
IFSVGGMIGS 
VYOQLTTGPV 
SIIPIPALLQ 
MMREKKVTIL 
IGSGIVNTAP 
AIFGPVAFFE 
PYVFIIFTVL 
QV 



21 

1 

LOSLQFGYNT 
PSVGLPVNRP 
PMWGEVSPT 
CXVLPFCPES 
BLFR8PAYRQ 
TWSLFWER 
VGPGPIPWFI 
LVLFFIFTYF 



31 
I 

GVINAPQKVI 
GRRNSMLMMN 
AFRGALGTLH 
PRFLLINRNE 
PILIAWLQL 
AGRRTLHLIG 
VABLFSQGPR 
KVPETK6RTF 



41 

I 

EEFYNQTWVH 
IiIiAFV8AVU4 
QLGIWGILl 
ENRAKSVLKX 
8QQLSGINAV 
LAGMAGCAIL 
PAAIAVAGFS 
DBIASGFRQG 



51 

I 

RYGESILPTT 
GPSKLGRSFE 
AQVFQLDSZM 
LR8TADVTHD 
FYYSTSIPEK 
MTIALALLEQ 
NWTSNFIVGM 
6ASQSDKTPE 



Seq ID NO: 82 DNA sequence
Nucleic Acid Accession #: BG001291 
Coding sequence: 44-541 



420 
480 
540 

600 
660 
720 



GGGGGTCGGA 


60 


GAG06CC6CT 


120 


AG06CT6CCA 


180 




240 


AAGGTGATOG 


300 


CCCACCACGC 


360 


ATTGGCTCCT 


420 


ATGATGAACC 


480 


TCCTTT6AGA 


540 


GGC7TCGTGC 


600 


ACCCTQCACC 


660 


TCCATCATGG 


720 


CTGCTGCAGT 


780 


CGCAACGAGG 


840 


ACCCATGACC 


900 


ACCATCCTGG 


960 


CTGCAGCTGT 


1020 


TTOGAGAAGG 


1080 


AOGGCCTTCA 


1140 


CTCATAG6CC 


1200 


CTGGAGCAGC 


1260 


TTCTTTGAAG 


1320 


GGTCCACGTC 


1380


GTGGGCATGT 


1440 


ACT6TGCTCC 


1500 


OGGACCTTCG 


1560 


ACRCCOSAGG 


1620 


CACCAGCCOG 


1680 


GAGACTTCCA 


1740 


GCAATGATGT 


1800 


ACTQTTOCTC 


1860 


TGTTATTTTT 


1920 


TTCCATGCCT 


1980


ATCTGTAGGG 


2040 


TCTATCCCAO 


2100 


GGTCAGGCTC 


2160 


TCTTTCTTTA 


2220 


AGTCTGGGCT 


2280 


GAGOGCCTGT 


2340 


TGTGTATAGA 


2400 


AAOTTATAGT 


2460 


CCTAAACRGA 


2520 


GGGTAGGAGG 


2580


CTTAGACTTC 


2640 


CAAAAATCTG 


2700 


AGGCATTTCT 


2760 


GGTGATGCTC 


2820 



60 
120 
180 
240 
300 
360 
420 
480 



11 



21 



31 
I 



GGGGGCGCCG OGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACQATGGC6C T6CTCGCCTT 
GCTGCTGGTC GTGGCCCTAC 06CGGGT6T6 GACAGAOGCC AACCTGACTG CGAGACAACG 



60 
120 




AGUVTCCAOAO GACTCCCAGC GAACGGAOGA 
T6A6AGAGAA AACACTTTCO AGT6CCAGAA 
CTGCGTTATA G0Q6CGGTGA AAATATTTCC 
OGCTGGTTGT GCAGC6ATGG AQAGACCCAA 
GCCCATGCCC TTCTTTTACC TCAAGTGTTG 
ACCTATCAAC TCATCAGTCST TCAAAGAATA 
GCTGTG6CTO GCCATCCTCC TGCTGCTGGC 
AGCCAC6GGA CT6CCACAGA CT6AGCCTTC 
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA 
GC5GATGGGAG AGTGOSGATC AQOTGCAOTT 
ACATTCAQAQ GAAGTCCAGA TCTCCTGAGT 
AAATCAAACX: TTGTAACICA TTTATTGCTG 
CCrCTGAGGO CTTCASTATT GATGGGGA66 
TGCTOAQATG CTTCCGACCT TTCAGGTGAC 
GGGTGAAGAC ATCCCTGQAG TGAAGGACTC 
AGGGCTGCOC CCATTCCAGT GGTQGAGGCG 
CTACCAGATT GCAGGAGGCA GAA6ATAACT 
ACCAQCTGGC ACA6GTGCAC AGATTCATAA 
ACTTAGGCCA AGTAGAGA6C ATCAGGGTAA 
CATCCATGGG GAGCTGAGAA ATCAGACTCA 
TTCAAAAGrr CACGAAAAAA AAAAAAAAAA 

Seq ID NO: 83 Protein sequence:
Protein Accession #: AAH01291






GGGTGACAAT 
OCCAAQGAOG 
ACOTTTTTTC 
GCCAGA0GA6 
TAAAATTCGC 
TGCTQQQASC 
CTCCATTQCA 
0QGA6CATGQ 
TTACCTCTTO 
GGCTCTTAAC 
AGTGATTTT6 
A3G6CXIACTC 
6A86CCTAAG 
GCA66AACAC 
CTCAGCATG6 
CTOTGOATGG 
AATTGTGTTG 
ATTCCCACAC 
ATGGCC3TTCA 
AAGTTCCACC 
AAAAAAAAAA 



AGAGTGTGGT 
TOCRAAT GQA 
ATGOTTGCGA 
AAGCGGTTTC 
TACTGCAATT 
AIX3G6T8AGA 
GCGG6CCTCA 
ACTG6CTCGA 
GTTTGACTTC 
CCTCAAGGGT 
GTGACAAGTT 
TTTTCCTTGA 
TACCACTCAT 
T6GGGGAGTC 
GG6GCA6TGG 
CTOCTTTTCC 
AAOAAACTTA 
GTGTGTGTTC 
TTTCTCTGTT 
AAAAACAAAT 
AAAAAAAAAA 



GTCATGTTTG 
CftOAGCCATA 
AGCAOTGCTC 
TCCTQGAAGA 
TAGA0GG6CC 
6CT6TGGTGG 
6CCT6TCTT6 
GAC0QTT8TC 
CCAGGGTCTT 
TCTTTAACTC 
TTTCTCTTT6 

GGAGAGTATG 
TGAATGATTG 
GGCACACGTT 
TCAACCTTTC 
GACTTCACCC 
AACATCTGAA 
AAGATGCA6C 
ACAAGGGGAC 



11 



21 



41 



51 



TTTTTTTTTT 
TGCGCOVrCT 
TTTTCTCTOO 
06CC0GCCGC 
AAGCAAGGCA 
ATTCTTACAG 
CTCCTCACCT 
GAGCACAAAC 
CCTTCCCCTT 
GTCACGCCAG 
GAACACATAG 
GGAGCTCTAA 
GATGAGCCXA 
CTCTTGCAAC 
AGTCCCCTGA 
CCACCTCTCC 
GGATC3U3TAT 
CTGTTTAGTC 
GAAGAAATGG 
CCAAT06CTA 
AACACGTCTA 
CCATTCCAGC 
TCOGGCOCrC 
ACGTTCAAAT 
TACAAOTGCA 
AAGACGCACA 
GCCAOCTCCC 
TCGGTGGTGG 
GAGGA6GAA6 
CTGACGGAGA 
CACGAGAACA 
GACGTCATGC 
6TCCTGGGCG 
TGCGAC6AA6 
CGCGGCTGCT 
AGCCCCAGCT 
CCCCCGGCCA 
GCCTCCAGGC 
GCCTCCTOGT 
OAGCTGGACG 
ATTAGTGGTC 
GAGTACTGTG 
ACGGGOGAAA 
CTCACCA6GC 
TGTAA6ATGC 
GAT08A0TGT 
CTCCCACCTG 
CCTGTAG6AT 
ACGAAGCTAA 
TTCTTTTTTC 



11 

1 

TTTTTTGCTT 
TTGTATTATT 
AGTCTCCTTC 



AACCCCAGCA 
ATGATGAACC 
GTGGGCAGTG 
GOAAACAATG 
CACCAATOSA 
AAGATGACGA 
CAGATAAACT 
TCCCCACOCC 
GCAGCTACAC 
ACGCACAGAA 
CCCOGCGGGT 
ATGGGATTCA 
OGAGAGAGGC 
CACCACCGAO 
CCCTGGCCAC 
TGGAGCCTCC 
GCCCACCGCT 
CAG6TAGCAA 
CTCOCTCCCA 
TTCAQA6CAA 
ACCTGTGCGA 
TGCACAAATC 
C6GAACCCX3G 
GCAAGTTCAA 
AGGA6GA06A 
GC3GAGAGGGT 
GCTCGCGGGG 
AGGGCATGGT 
AGAA6CATAA 
ACTCGGTGGC 
CCCCGGGCGA 
CGCTGAGCCC 
CGATGCCCAA 
A6CTCAAAGA 
CGGAGCACTC 
GAGGGATCTC 
OGGGCACGGG 
QGAAAGTCTT 
GGCCTTATAA 
AC3VTGAAAAC 
CTTTTAG06T 
T6AATAAT6A 
ACACCCCCTT 
TTTTTTCTAG 
GAATATQAGA 
TTTTTCCTTT 



21 

I 

AAAAAAAAGC 
TCTAATTTAT 
TTTCTAACOC 
GCCX3C06COO 
CTTAAGCAAA 
AGACCACGGC 
CCAGATGAAC 
CAATGGCA6C 
GATGAAAAAA 
TTGTTTATCA 
TCTGCACTGQ 
TGGGATGAGT 
ATGTACAACT 
CACTGATGGA 
TGGTATCCCT 
TATTGCAGAC 
TTCCGGCCTG 
ACATCACTT6 
CCATCACC08 
aSCCATGGAT 
GTCCCCAGGC 
GCGGCOCTTC 
GCCOCOGQTC 
CCTGGTG6TG 
CCACGOGTGC 
GTCCCCCATG 
CACCAGCGAC 
GAGOGAGAAC 
0GAG6AAGAG 
GGACTAGGGC 
06CGGT06TG 
GCTCAGCTCC 
GCGOSGCCSVC 
OGGOGAGTCG 
GTC60CCTCG 
CTTCTCTAAO 
CACGGAQAAC 
TCCCTTCCTT 
CTCX3GAGAAC 
G6GGCX5CAGC 
CAOGCCCAaC 
CAAGAACTGT 
ATGOSAGCTG 
GCATGGCCAG 
GTACA6TACC 
TATAAAAACT 
TTTCACCACT 
TCCCATQTGA 
GTGCTTGTCA 
TTTTTTTTTT 



31 
I 

CATGACGGCT 
TTTGGATGTC 
GGCTCTCCCG 
CCOGCCCCXiC 
OGGGAATTCT 
CCGTTGGGAG 
TTCCCATTGG 
CTCTGCTTAO 
6CATCCAATC 
AOOTCATCTA 
AGGGOCCTCT 
6CAGAATATG 
TGCAAACAGC 
TTAAGAATCT 
TCAGGACTAG 
AATAACCCCT 
GCAGAAGGGC 
GACCX!CCACC 
A6T6CX7TTTG 
TTCTCTAGGA 
OGGCCCAOCC 
CTG60GAG6C 
AAGTCCAAGT 
CACCX3GC6CA 
ACCCA6GCCA 
AOGGTCAAGT 
TTGGTGG6CA 
GACCCCAACr 
GAAGAA6A66 
TTCGGGCTGA 
GGCGTGGGCG 
ATGCAGCACT 
CTGGCCGAGG 
GACX3GCATAG 
GGGGGCCTGT 
OGGATCAAGC 
GTGTACTOGC 
AGCTTCGGAG 
GGGAGCTTGC 
G6CACGGGAA 
TCAAAAGAGG 
AGCAATCTCA 
TQCAACTATQ 
GTGGGGAAGG 
CTGGAGAAAC 
GAATAGAOGT 
CCCTTTCCCC 
TTTAAACAAA 
CCAGCACACC 
TCCTTTATGT 



41 

I 

CTCCCACAAT 
AAAAG6CACT 
ATGTQAACOG 
AGCCCACCAT 
CJGCCCQAGCC 
CrcCAGAAGG 
GG6ACATTCT 
A AAAAG CTOT 
COGTGGAGQT 
GAAGAATTTG 
CCrCCCCTOG 
CCCCGCAGGG 
CATTCACCAG 
ACTTAGAAAG 
GTGCAGAAT6 
TTAACCTOCT 
QCTTTCCACC 
GCATAGAGCG 
ACAGGGTGCT 
GACTTAGAGA 
CTATGCAAAG 
CCCCCCTCCC 
CATGCGAGTT 
GCCACACGGG 
GCAAGCTGAA 
CGGAQ6A0GG 
GC6CCAGCAG 
TGATCCOGGA 
A66AAGAG6A 
GCCTGGAGGC 
ACGAGAGCOG 
TCAGCGAGGC 
CCGAQGGCCA 
ACGATGGCAC 
CCAAAAAGCT 
TCBAOAAGGA 
AGTGGCTCXX: 
ACrCCAGACA 
GCTTTCTCCAC 
GT66AGGGA6 
GC3U3ACQCAG 
CTGTCCACAG 
CCTGTGCCCA 
ACGTTTACAA 
ACATGAAAAA 
ATATIAATAC 
ATC36CCCTCC 
CAAACAAACA 
TGTTTTTTTT 
TCTCACOGTT 



51 
1 

TCATCTTOCC 
6ATGAAGATA 
AGC03T0QTC 
GTCTCGCOGC 
TCTTGAAQCC 
GGATCATGAC 
TATTTTTATC 
Q6ATAAGCCA 
TGGCATCCAO 
CCCCAAACAG 
TTCTGCACAT 
TATTTGTAAA 
T6CATG6TTT 
0QAACA06QA 
TCCTTCCCAG 
AAGAATACCA 
CACTCCCCCC 
CCTGGGGG06 
GOGGTTGAAT 
GCTGGCAGGG 
GTTACTGCAA 
TCCTCTGCAA 
CTGCXWCAAG 
C6AGAAGCCC 
GCGCCACATG 
TCTCTCCACC 
CGCGCTCAAG 
GAAGGGGGAC 
GGAGGAGQAG 
G60GC6CCAC 
GGCCCTGGCC 
CTTCCACCAG 
CAGGGACACT 
TGTTAATGGC 
GCTQCTGGGC 
GTTCGACCTG 
0GGCTA0GC6 
ATCQCCTTTT 
ACOQCCCGGG 
CaCXJCCCCAT 
OGACACTTGT 
GAGAAGCCAC 
GAGTAGCAAG 
ATGTQAAATT 
AT6GCACAGT 
GCCTOOCTCA 
AGCCCCACTC 
AACAGAAGTA 
CTTTTTCTTT 
T6AATGCAT6 



180 
240 

300 
360 
420 
480 
540 
600
660
720
780
840
900
960
1020

1080

1140 
1200 
1260 
1320 



31 

.1 I I I I 

NAIiLALLLW AIiPRVWTDAN LTARQRDFBD SQRTDEGDKR VWCHVCERES TFECQNPRRC 
KWTBPYCVIA AVKIFPRPPM VAKQCSAGCA AMERPKPEEK RPLLBEPMPP FYUCCC3CIRY 
OniEOFPINS SVFKBYA6SN GESG66X«LA ZLIiIiIiASIAA GbSLS 

Seq ID NO: 84 DNA sequence

Nucleic Acid Accession #: NM_022893.1

Coding sequence: 229-2726 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860
1920
1980
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 




ATCTQTATGQ GGCAATACTA TTGCATTTTA CGCRAACTTT GAGCCTTTCT CTTGTGCAAT 3060 
AATTTACATG TTGTGTATGT TTTTTTTTAA ACTTAGACAG CATGTATGGT ATGTTATGGC 3120 
TATTTTAAAT TGTCCCTAAT TCGTTGCTGA GCAAACATGT TGCTGTTTCC AGTTCC3GTTC 3180
TGAOAGAAAA AGAGAGAGAG AOAGAAAAAG ACCATGCTGC ATACATTCTG TAATACATAT 3240 
CATGTACAGT TTTATTTTAT AACGTQAGOA GGAAAAACaG TCTTT6QATT AACCCTCTAT 3300 
A6ACAGAATA GATA6CACTG AAAAAAAATC TCTATGAGCT AAATGTCTGT CTCTAAAGGG 3360 
TTAAATGTAT CAATTGGAAA GGAAGAAAAA AGGCCTTSAA TTGACRAATT AACAfiAAAAA 3420 
CAGAACAAGT TTATTCTATC ATrPGQTTTT AAAATATGAG TGCCTTCGAT CTATTAAAAC 3480
CACATCGATQ QTTCTTTCTA CTT6TTATAA ACTTGTA6CT TAATTCAGCA TTGG6TGAGG 3540 
TAATAAAC3CT TAG0AACTA0 CATATAATTC TATATTGTAT TTCTCACAAC AATGGCTACC 3600 
TAAAAAQATQ ACCCATTATG TCCTAGTTAA TCATCATTTT TCCTTTAGTT TAATTTTATA 3660 
AACAAAACTG ATTATACCAG TATAAAAGCT ACTTTGCTCC TQGTGAOAGC TTAAAAOAAA 3720 
TQGGCTGTTT TGCCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGT6TAAT6 3780 
IGCAAAAGCC CTCGAAGGCA ATTAAATACA CTAGTAAGGA GTTCATTTTA TGAAGATATT 3840 
TOCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCTACTT 3900 
GGTTGTCAAG TGGAC3W^TCA AATGATAAAC TTTAAGACCT TGTATACCAT ATTG AAAGGA 3960 
AGAGGCTGAC AATAAC3GTTT GACAGAGGGG AACAGAAGAA AATAATATGA TTTATTAGCA 4020 
C3UICGTGGTA CTATTTGCCA TTTAAAACTA GAACaU3GTAT ATAAGCTAAT ATTGATACAA 4080 
TGATGATTAA CTATGAATTC TTAAGACTTG CATTTAAATG TGACATTCTT AAAAAAAGAA 4140 
GAGAAAGAAT TTTAAGAGTA GCAGTATATA TGTCTGTGCT CCCTAAAAGT TG TACTTC AT 4200 
TTCTTTTCCA TACACTGTGT GCTATTTGTG TTAACATGQA AGAGGATTCA TTto -A"iiTxA T 4260 
TTTTATTTTT TTAATTTTTT CTTTTTTATT AAGCTAGCAT CTGGCGCAGT TGGTGTTCAA 4320 
ATAGCACTTG ACTCTOCCTG TOATATCTOT ATCTTTTCTC TAATCAGAGA TACAGAGGTT 4380 
GAGTATAAAA TAAAOCTGCT CAGATAGGAC AATTAAGTGC ACTGTACAAT TTTCCCAGTT 4440 
TACAGQTCTA TACTTAAGGG AAAAGTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500 
CATTGAGGAQ CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 
TATTGAGCTT ACrTACTTOG ACGCAACATt GCAAGOGCTG TGAATGGAAA CaGAATACAC 4620 
TTAACATAGA AATGAATGAT TQCTTTC30CT TCTACAGTOC AAQGATTTTT TTGTACAAAA 4680 
CTTTTTTAAA TATAAATGTT AAGAAAAATT TTTTTTAAAA AACACTTCAT TATGTTTAGG 4740 
GGGGAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCC3VTTT 4800 
AAAAATGGTA GTGGAAATTC TATGCCTTGG ATACACACC6 CTCTTCAGGT TGTAAAAAAA 4860 
AAAAACATAC ATTGGQGAAA GGTTTAAGAT TAmTAGTAC TTAA ATATAG GAAAA1X3CAC 4920 
ACTCATGTTO ATTCCTATGC TAAAATACAt TTATOGTCTT TTTTCTGTAT TTCTAGAATQ 4980 
GTATTTGAAT TAAATGTTCA TCTAGT6TTA GGCACTATA3 TATT TATAT T GAAGCTTGTA 5040 
TTTTTAACTG TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100 
AAAAAAGACA GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 
ATrUGTATGC TTCAAAAAAA AAAAAAAGAG AOAAACAAAA AAGTGTGACA TTACAGATGA 5220 
GAAQCCATAT AATG6CS3GTT TGGOGGAGCC TOCTAGAATG TCACATGGAT GGCTGTCATA 5280 
GGGGTTGTAC ATATCCTTTT TTGTTCCTTT TTCCTGCTGC CATACTGTAT GCAGTACTGC 5340 
AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGr CXXTTTCCTT CTATCACCCT 5400 
ACATTCCA6C ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AA GGAAA AAA 5460 
AAAAAAAAAC CAATGTTTTC CAGTTTTTTT CATTGCCAAA AACTAAATGG TGCTTTATAT 5520 
TTAGATTGQA AAOAATTTCA TATGCAAAGC ATATTAAAGA QAAAGCCCGC TTTAGTCAAT 5580 
ACTTTTTTGT AAATGGCAAT GCAGAATATT TTGTTATTGG CCTTTTCTAT TCCTGTAATG 5640 
AAAGCTGTTT GTCGTAACTT GAAATTTTAT CTTTTACTAT GGGftO T CACT yOTArrATT 5700 
GCTTATGTGC CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTTT TATTTT TCTTTOTTTT 5760 
TATTTTTTTT TTTATTTAGA TGACCAAAGG TCATTACAAC crGGCTTTTT ATTGTATTTG 5820 
TX T C ro aTCT TTGTTAAGTT CTATTGGAAA AACXaCTGTC TGTGTTTTTr TGGCAGTTGT 5880
CTQCATTAAC CTOTTCATAC ACCCATTTTG TCCXTTTTATT GAAAAAATAA AAAAAATTAA 5940 
A 



Seq ID NO: 85 Protein sequence:
Protein Accession #: NP_075644.1

1 11 21 31 41 51 

ilSRRKQGKPQ HLSKREP6PB PLEAILTDDE PDHGPLGAPE GDHDLLTOGQ CQraUPPLGDI 60 

LIFIEHKRKQ CNGSIiCLBKA VDKPPSPSPI EMKKASNPVB VGIQVTPEDD DCLSTSSRRI 120 

CPKQBHIADK LLHWRGLSSP RSAHGALIPT PGMSAEYAPQ GICKDEPSSY TCTTCKQPPT 180 

SANFLIiOHAQ NTHGIiRiyLB SEHGSPLTPR VGIPSGLSAE CPSQPPLHGI HIADNKTPFNL 240 

LRIPGSVSHE ASGLAEGRPP PTPPLPSPPP RHMPHRIE RLGAEEMALA THHPSAFDRV 300 

liRLNPMAMEP PAMDPSRRLR ELAGNTSSPP LSPGRPSPMQ RliQPFQPGS KPPFLATPPL 360 

PPLQSAPPPS QPPVKSKSCE FCGKTFKFQS NLWHRRSHT GBKPYKOJLC DHACTQASKL 420 

KRHMKTHMHK SSPMTVKSDD GLSTASSPEP GTSDLVGSAS SALKSWAKF KSENDPNLIP 480 

EHQDSSBEED P W«i^«k«rk BEBBIiTESBR VDYGFGLSLE AARHHQJSSR GAWGVGDES 540 

RALPDVMQGM VLSSMQHFSB AFHQVLGBKH KRGHLAEAEG HRDTCDEDSV AGBSDRIDDG 600 

TVNGRGCSPG ESASGGLSKK LLLGSPSSLS PFSKRIKLBK EFDLPPATMP NTENVYSQWIi 660 

AGYAASRQLK DPFLSFGDSR QSPPASSSEH SSENGSLRFS TPPGBIiDGGI SGRSGTOSGG 720 

STPHISGPGT GRPSSKBGRR SDTCEYCGKV FKNCSNLTVH RRSHTGERPY KCELQiYACA 780 
QSSKIiTRlIMK THGQVGKDVY KCBICKMPP8 VYSTLEKHMK KMHSDRVIMI DIKTB 



Seq ID NO: 86 DNA sequence 

Nucleic Acid Accession #: XM_035292.2

Coding sequence: 53-1576

1 11 21 31 41 51 

GCTCGCTGGG CCGOGGCTCC OGGGTGTCCC AGGCCCGQCX: GOTGCGCAGA GCATG6CGGG 60 

TGCGGGCCCG AAGCGGCGCG CGCTAGOOGC GCCGGCGGCC 6AGGAGAAGG AAGAGGCGCG 120 

GGAGAAGATG CTCGCCGCCA AGAGCGCGGA CGGCTOGGCG CCGGCAGGCG AGGGCGAGGG 180 

OGTGACCCTG CAGCGGAACA TCACGCTGCT CAACG6CQTG 6CCATCATCG TGGGGACCAT 240 

TATGGGCrCO GGCATCTTOG TGR06CCCAC GGGCOTGCTC AAOGAGGCAG GCTCGCCX3GG 300 

GCTGGCGCTG GTGGTGTGGG CXIGOGTGOQG OGTCTTCTCC ATOGTGGGOS CGCTCTGCTA 360 

CGCGGAGCTC GGCACCACCA TCTCCAAATC GQGCGGCGAC TA0GCCTAC3V TGCTGGAGGT 420 

CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480 

ATCGCaCTAC KiamSGCCC TCGTCTTOGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540




CT6CC0C36TG CC0GAG6AG6 CAGCCAA6CT 
G6C06TGAAC TGCTACAGC6 T6AAGGCC6C 
CAAGCTCCTG GCCCTGGCCC TGATCATCCT 
TGTGTCCAAT CTAGATCCCA ACTTCTCATT 
TGTX3CTGGCA TTATACAGCG GCCTCTTTOC 
CACAGAQGAA ATOATCAACC CCTACAGAAA 
CATCGTGA06 CTGGTGTA06 T6CTGACCAA 
6CAGATGCT6 T0C3TGCGAGG CCGTGGCGGT 
GTCCTGGATC ATCCCOGTCT TCGTGGGCCT 
QTTCACATCC TCCAGGCTCT TCTTCGTGGG 
CTCCATQATC CACCCACAGC TCCTCACCCC 
GACOCTGCTC TAOGCOTCT CCAAGGACAT 
CAACTGGCTC TGOGfTGGCCC TSOOCATCAT 
TC3AGCTTGAG CGGCCCATCA AGGTGAACCT 
CCTCTTCCTG ATCGCCGTCT CCTTCTGQAA 
CATCATCCTC AGCGGGCTGC CaSTCTACTT 
GTGGCTCCrC C3U3GGCATCT TCTCCACQAC 
CCCCCAGGAO ACATAGCCAG 6A66C0SAST 






CX3TGGCCTGC 
CS^CCCGQGTC 
6CTGG6CTTC 
TGAAGOCACXr 
CTATGOAGGA 



CCTOOCCTAC 
66ACTT06G6 
GTCCTGCTTC 
GTCCCGGGAA 

CTTCTCCGTC 
CG6CATGATC 
GGCCCTGCCT 
GACACCCGTG 
CTTC6GGGTC 
OGTCCTGTGT 
GGCTGCCGGA 



Seq ID NO: 87 Protein sequence:
Protein Accession #: XP_035292.2



MAGAGPRRKA 
GTIXGSGIFV 
LEVYGSLPAF 
U.TAVNCYSV 
©IIVLALYSG 
STEQMLSSEA 
SILSKIBPQL 
RKPBLERFIK 
KPKfOiLQGIF 



11 
I 

LAAPAAEEKE 
TPTGVLKBAG 
LKLWIELLII 
KAATRVQDAF 
LFAYGGVINyL 
VAVDFQWHL 
LTPVPSLVPT 
VKLALPVFFI 
STTVLCQKLM 



21 

1 

EARBKMLAAK 
SFGIiALWWA 
RPSSQYIVAL 
AAAKLLALAIt 
NFVTEEMINP 
GVMSHIIFVF 
CVMTLLYAFS 
LAOiFLIAVS 
QWPQST 



31 

I 

SADGSAPAGE 
ACGVFSIVGA 
VFATYIiLKPL 
IILLQFVQIG 
YKHLPLAZII 
VGLSCFGSVN 
KDZFSVINFF 
FWKTPVE06I 



CTCTGOGTGC 
CAQQATGCCT 
GTCCAGATCO 
AAACTGGATG 
TGGAATTACT 
GCCATCATCA 
TTCACCAOCC 
AACTATCACC 
0GCTCC6TCA 
GGCCACCTGC 
CTCQTGTTCA 
ATCAACTTCT 
TGGCTGCGCC 
GTOTTCTTCA 
GAGTGTGGCA 
TGGTGGAAAA 
CAGAAGCTCA 
GGAGCATGC 



41 
I 

GEGVTLQRNI 
LCYAEL6TTI 
FPTCPVPEEA 
KGDVSNLDPN 
SLPIVTLVYV 
GSliFTSSRIiF 
SFFKWLCVAIj 
GFTIILSGLP 



TGCTGCTCAC 
TT6C06C0QC 
GGAAG66TGA 
TGGGGAACAT 
TGAATTTOGT 
TCTCCCTGCC 
T6TCCAC08A 
TOGGOGTCAT 
ATGGGTCCCT 
CCTCCATCCT 
CGTGTGTQAT 
TCAGCTTCTT 
ACAGAAAGCC 
TCCTGGCCT6 
TCGGCTTCAC 
ACAAGCCCAA 
TGCAGGT6GT 



51 
1 

TliUJGVAIIV 
SXSGGDYAYM 
AKLVAC3.CVL 
FSFEGTKLDV 
LTNLAYFTTL 



AIIGHIHLRH 
VYFFGVWinCN 



Seq ID NO: 88 DNA sequence 

Nucleic Acid Accession #: NM_005268.1

Coding sequence: 168-989



1 
I 

TAAAAAGCAA 
TCTGQATATG 
A6CCCTGAGG 
TCTTTGAGG6 
T6TCTCTGGT 
GTGATGACCA 
TTGATGAGTT 
CATGCCCCTC 
AOCGAGAAGC 
0T66GCTCTG 
TTCTCTATGT 
AOSCAQATCC 
TTTTCACCCT 
TCATCTACCT 
TGTGCACAG6 
C6GGTGACCT 
GAGAOCATGT 
CCTGGATGGG 
CATGAGGTAG 
TCAACTCCAG 
GCTOGGTTTC 



11 
I 

AAGAATTCGC 
AAATTCAAGC 
AGTAGTCACT 
ACrCCTGAlGT 
CTTCATCTTC 
CAAGGACTTC 
CTTCCCTGTG 
ACTGCTOGTG 
CCATGGG6AG 
GTGOACATAT 
GTTCCACTCA 
ATGTCCCAAT 
CTTCATGGTG 
GGTGAGCAAG 
TCATCACCCC 
CATCTTTCTG 
GAAGAAAACC 
GAGGCTCTAG 
GOG CAGG CAA 
CCACCTGCCC 
CTTTTCTAGA 



21 

I 

GGCOGGGTCG 
TGCTTGCTGA 
CAGTAGCAGC 
GGGGTCAACA 
OGOGriGCTGG 
GACT6CAATA 
TCCCATGTGC 
GTCATGCACG 
AACA6T6GGC 
GTCTGCAGCC 
TTCTACCCCA 
ATAGTGGACT 
GCCACAGCTG 
A6ATGCCACX3 
CAC6GTACXA 
GGCTCAGACA 
ATCTTGTGAG 
CATCTCTCAT 
GA6AQAGGAT 
CAGCTGGA06 
ATGGAAATAG 



31 
I 

ACACX3GGCTT 
GTCCTATTGC 
TGACGCGTGG 
AGTACTCCAC 
TGTACCTGGT 
CTOGCCAQCC 
GCXrrCTGGGC 
TGGCCTACCG 
GCCTCTACCT 
TAGTGTTCAA 
AATATATCCT 
GCTTCATCTC 
CCATCTGCAT 
A6TGCCIGGC 
CCT C TTC C TG 
GTCATCCTCC 
GGGCTGCCTG 
A6GTGCAACC 
TCAOACGCTC 
GCACTGGGCC 
TGAGGGOCAA 



41 

I 

CCCCGAAAAC 
CGGCTGCTGG 
GTCCACCATG 
AGCCTTTGGO 
GAOGGCCOAG 
CGGCTGCTCC 
CCTGCAGCTT 
GGAGGTTCAG 
GAACCCCSGC 
G606AG08T0 
CCCTCCT6TG 
CAAQCCCTCA 
CCTGCTCAAC 
AGCAAOGAAA 
GAAACAAGAC 
TCTCTTACCA 
GACTGGTCT6 
TGAGAGTGGG 
TGGGAGCCAO 
AGTTCCCCCT 
TGC 



51 
I 

CTTCCCCGCT 
6AGCCAGGAG 
AACTGGAGTA 
C6CATCT66C 
CGTGTGTGGA 
AACGTCTGCT 
ATCCTGGTGA 
GA6AAQAGGC 
AA6AAG06GG 
QACATGGOCT 
GTCAAGTGCC 
QAGAAQAACA 
CTCGTG6AGC 
GCTCAAGCCA 
GACCTCCTTT 
GACOGCCCCC 
GCAGGTTGGG 
GGAGCTAAGC 
TICCTAGTCC 
CTGCTCTGCA 



Seq ID NO: 89 Protein sequence:
Protein Accession #: NP_005259.1



11 



21 



51 



31 41 

III! 

MNHSIFEGIiL SGVNKYSTAF GRIHLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQFGC 
SNVCFDBFFP VSHVRLWAZfQ LILVTCPSLL WMHVAYREV QEKRHREAHG ENSGRLYUfP 
GKKRGGXiNWT YVCSIiVFKAS VDIAFLYVFH SFYPKYILPP WKCHADPCP NIVDCFISKP 
SEXNIFTIiFM VATAAICILL NLVBLIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ 
DDLLSGDLIF IiGSDSHPPLL FDRFRDHVKK TIL 



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: NM_002391.1

Coding sequence: 26-457 



600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 



60 
120 
180 
240 



1 11 21 31 41 51 

) I I 1 I I 

CGGGCGAAGC AG0G06GGCA G0GA6ATGCA GCACGGAGGC TTCCTCCTCC TCACCCTCCT 
G6CCCTGCT0 6C6CTCACCT C0GCGGT06C CAAAAAGAAA GATAAGGTQA AOAAGGGOGO 
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGGCCIGC AGGCOCAGCA GCAAGGATTG 
CGGCGTGGGT TTCCGOGAGG GCACCTQCGQ GGCCCAGACC CAGCGCATCC GQTGCAGGGT 
GCCCTGCAAC TGGAAQAAGG AGTTTGQAGC CQACTGCAAG TACAAGTTTG AGAACTGGGG 
TGCGTGTGAT GGGGGCACAG OCACCAAAST CC6CCAAGGC ACCCTGAAGA AGGOGOGCTA 



60 
120 
180 
240 

300 
360 






600 
660 
720 



60 
120
CAATGCTCAG TGCXAGGAGA CCATCCGCGT O^CCAAGCCC TGCACCCCCA AGACCAAAGC 420
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGO ATGCC3UtGQA 480 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCA6QCCCG AQATGTQRCC 540 
CACCAGTGCC TTCTOTCTOC TCOTTAGCTT TAATCAATCA T6CCCTGCCT TGTCCCTCTC 
ACTCCCCAQC CCCACCCXTTA AGTOCCCAAA GTGGGQAGGG ACAAGGGATT CTGGGAAGCT 
TGAGCCTCCC CCAAAGCAAT GMASTCCCA SAGCCOSCTT TTGTTCTTOC CCACAATTCC 
ATTACTAAGA AACACATCAA ATAAACTQAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 

Seq ID NO: 91 Protein sequence:
Protein Accession #: NP_002382.1

I II 21 31 41 51 

I t i 1 I I 

MQHRGPLIiLT LIALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DOGVGFREGT 60 
GGAOTORIRC RVPCNWKKEF GADCKYKFEW WGACDGOTOT KVRQGTLKKA RYMAQCQBTI 120 
RVTKPCTPKT KAKAKAKK6K GKD 

Seq ID NO: 92 DNA sequence

Nucleic Acid Accession #: NM_005130.1

Coding sequence: 98-802

1 11 21 31 41 51 

I 1 I I I I 

CTCTACCTGA CACAGCTGCA GCCTGCAATT CACTCCCACT GCCTGGQATT GCACTGQATC 
OGTOTOCTCA OAACAAaGTG AAOGCCCaGC TGCRGCCATO AAQATCTGTA GCCTCACCCT 

GCTCTCCTTC CTCCTACTG6 CTGCTCAGGT GCTCCTOOTO GAGGOOAAAA AAAAAGTGAA 180 

GAATGGACTT CACAGCAAAG TGGTCTCAGA ACAAAAGOAC ACTCTGGGCA ACACCCA6AT 240 

TAAGCAGAAA AGCAGGCXTG GGAACAAAGG CAAOTTTGTC ACCAAAGACC AAGCCAACTG 300 

CAGATGGGCT GCTACTGAGC AGGAGGAGGG CATCTCTCTC AAGOTTGAGT GCACTCAATT 360 

GGACCaTGAA TTTTCCTGTO TCTTTGCTGG CAATCCAACC TCATGCCTAA AGCTCAAGGA 420 

TQAGAfiAGTC TATTGQAAAC AAOTTOCCCG GAATCTGOQC TC3W31GAAAG ACATCTGTAG 480 

ATATTCCAAG ACAGCTGTGA AAACCAGAGT GTGCAGAAAG 6ATTTTCCAG AATCCAGTCT 540 

TAAGCTAGTC AGCTCCACTC TATTTGGGAA CACAAAGCCC AGGAAGGAGA AAACAGA6AT 600 

GTCCOCCAGG GAGCACATCA AGGGCAAAGA GACCACCCCC TCTAGCCTAG CAGTGACCCA 660 

GACCATGOOC ACCAAAQCTC COQAGTGTGT QQAGGACCCA GATATGGCAA ACCAGAG6AA 720 

GACTGCCCTO GAGTTCTGTG GAGAGACTTQ GAGCTCTCTC TGCACATTCT TCCTCAGCAT 780 

AOTGCAGGAC ACGTCATGCT AATGAGGTCA AAAGAGAACG GGTTCCTTTA AGAGATGTCA 840 

TOTCGTAAGT CCCTCTGTAT ACTTTAAAGC TCTCTACAGT CCCCCCAAAA TATGAACTTT 900 

TGT6CTTAGT GAGTCCAAC6 AAATATTTAA ACAAGTTTTG TATTTTTTGC TTTTGTGTTT 960 

TOQAATTTOC CTTATTTTTC TTGGATGOGA TGTTCAGAGG CTGTTTCCTQ CA6CATGTAT 1020 

TTCCATGGCX: CACACftGCTA TCTGTTTGAG CAGCGAAGAG TCTTT6AGCT GAATGAGCCA 1080 

GAGTOATAAT TTCaOTGCAA CGAACTTTCT GCTGAATTAA TGGTAATAAA ACTCTGGGTQ 1140 
TTTTTCAAAA AAAAAAAAAA AAA 

Seq ID NO: 93 Protein sequence:
Protein Accession #: NP_005121.1

1 11 21 31 41 51 

MKICSLTU.S FLLIiAAQVLL VBGKKKVKNG LHSKWSEQK DTLQNTQIKQ KSRPGNKGKP 60 

VTKDQANCRW AATEQEBGIS LKVECTQLDH EFSCVPAOIP TSCLKLKDER VYWKQVARNL 120 

RSQKDICRYS KTAVKTRVCR KDPPBSSLKL VSSTLPGNTK PRKEKTEMSP REHIKGKBTT 180
PSSLAVTOTM ATKAPBCVBD PDMANQRKTA IJBFCGETHSS LCTFFLSIVQ DTSC 

Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101 

Coding sequence: 125-1891 

1 11 21 31 41 51 

CTCCTCACAG GTGTGTCTCT AGTCCTCGTG GTTGCCTGCC CCACTCCCTG CCGAGACGCC 60 

TGCCAGAAAG GTCACCTATC CTGAACOCCA GCAAGCCTGA AACAGCTCAQ CCAAGCACCC 120 

TGOQATGGAA GCTGCAGATG CCTCCftGGAG CAAOGGGTCG AGCCCAGAAG CCAGGGATGC 180 

CCGGAGCCCG TCGG6CCCCA 6T06CAOCCT GGA6AATGGC ACCAAGGCTG ACGGCAAGGA 240 

TGCCAAGACC ACCAACGGGC ACGGCGGGGA GGCAGCTGAG GGCAAGAGCC TGGGCAGCGC 300 

CCT6AAGCCA iGGGGAAGGTA GGAGCGCCCT GTTCGCGGGC AAT6AGTGGC GGOGACCCAT 360 

CATCCAGTTT GTCGAGTCCG GGGACGACAA GAACTCCAAC TACTTCAGCA TGGACTCTAT 420 

GGAAGGCAAG AGGTC5GCCGT ACGCAGGGCT CCAGCTGGGG GCTGCCAAGA AGCCACCCGT 480 

TACCTTTQCC OAAAAGGGOG AC33TGCGCAA GTCCATTTTC TOGGAGTCCC GGAAGCCCAC 540 

GGTGTCCATC ATGGAGCCOG GGGAGACCCG GOGGAACAGC TACCCCCGGO CCGACAOGGO 600 

CCTTTTTTCA CGGTCCAAGT CCGGCTCCGA GGAGGTGCTO TG06ACT0CT GCATC3GGCRA 660 

CAAGCAGAAG GCGQTCAAGT CCTGCCTGGT GT6CCAGGCC TCCTTCTGCG AGCTGCATCP 720 

C3UVGCCCCAC CTGGAGGGCG CCGCCTTCCG AGACCACCAG CTGCTCGAGC CCATCCGGGA 780 

CTTTGAGGCC OGCAAGTGTC CCGTGCATGG CAAGACGATQ GAGCTCTTCT GCCAGACCGA 840 

CCAGACCTGC ATCTGCTACC TTTGCaTGTT CCAGGAGCAC AA6AATCATA GCACCGTGAC 900 

AGTGGAGGA6 GCCAAGGCCG AGAAGGAGAC GGAQCTGTCA CTGCAAAAGO AOCAGCTGCA 960 

GCTCAAGATC ATTGAGATTG AGGATGAAGC TGAGAAGTGG CAGAAGQAGA AGQACGGCAT 1020 

CAAGAGCTTC ACCACCRATG AGAAGGCCAT CCTGGAGCAG AACTTCCGGG ACCTGGTGCG 1080 

GGACCTGGAG AAGCAAAAGG AGGAAGTGAG GGCTGCGCTG GAGCAGCGGG AGCAGGATGC 1140 

TGTGGACCAA GTGAAG6TGA TCATGGATGC TCTGGATGAG AQAGCCAAGG TGCTGCATGA 1200 

GGACAAGCAG ACCCGGGAGC AGCTGCATAG CATCftGCGAC TCTGTOTTGT TTCT6CAG6A 1260 

ATTTGGTGCA TTGATOAGCA ATTACTCTCT CCCCCCACCC CTGCCCACCT ATCATGTCCT 1320 

GCTGGAGGGG GAGGGCCTGG GACAGTCACT AGGCAACTTC AA6GACGACC TGCTCAATGT 1380 

ATGCATGCGC CACGTTGAGA AGATGTGCAA GGCGGACCTG AGCCGTAACT TCATTGAGAG 1440 

GAACCACATG GAGAACGGTG GTGACCATCG CTAT6TGAAC AACTACACGA ACAGCTTCGG 1500 



GGGTOAOTGO AGTGCACCGG ACACCATGAA GAQATACTCC ATGTACCTGA CACKCAAAGG 1560

TGGGGTCaSO ACATCATACC AGCCCTCQTC TCCTGGCCGC TTCACCAAGG AGACCACCCA 1620 

GAAGAATTTC AACAATCTCT ATGGCACCAA AGGTAACTAC ACCTCCOQGG TCTGGGAGTA 1680 

CTCCTCCAGC ATTCAGAACT CTQACAATGA CCTGCCCGTC GTCCAAGGCA GCTOCTCCTT 1740 

CrCCCTGAAA GGCTATCCCT CCCTCATGCG GAGCCAAAGC CXICAAGGCCC AGOCCCAOAC 1800 

TTGGAAATCr 6GCAAGCAGA CTATGCTGTC TCACTACCGG CCATTCTACG TCAACAAAG6 1860 

CAAOSGGATT GQOTCCAAOG AAGCCCCATG AGCTCCTGQC GGAAGGAAC6 AGGOQCCACA 1920 

CCCCTGCTCT TCCTCCTGAC CCTGCTGCTC TTGCCTTCTA AGCTACTGT6 CTTGTCTGGG 1980 

TCGGAGGGA6 CCTGGTCCTG CACCTGCCCT CTGCAGCXXJT CTGCCAGCCT CTTGGGGGCA 2040 

GTTCCGGCCT CTCCGACTTC CCCACTGGCC ACACTCCATT CAGACTCCTT TCCTGCCTTG 2100 

TGACCTCAGA TGGTCACCAT CATTCCTGTG CTCAGAGOCC AACCCATCAC AGGG6TGAGA 2160 

TA60TT06GG CCTGCCCTAA CCCGCCAGCC TCCTOCTCTC GGGCTOGATC TGGGGGCTAG 2220 

CAOTOAOTAC CXX5CATGGTA TCAGCCTGCC TCTCCXXSCXX: AC6CCCTGCT GTCTCCaGOC 2280 

CTATAGACGT TTCTCTCCAA GGCCCTATCC CCCRATGTTG TCAGCRGATG CCTGGACaGC 2340 

ACAGCCACCC ATCTCCCATT CACATGGCCC ACCTCCTOCT TCCCAOAGOA CTOGCCCTAC 2400 

GTGCTCTCrC TCX3TCCTACC TATCAATGCC CSiGCATGGCA GAACCTGCAO TG6CCAA0GG 2460 

CroCAGATGG AAACCTCTCA GTOTCTTGAC ATCACCCTAC CCAGGCXSGTG GGTCTCCACC 2520 

ACAGCCACTT TCAGTCTGTG GTCCCTGGAG GGTGGCTTCr CCTGACTGGC AGGAT6ACCT 2580 

TAGCCAAQAT ATTCCTCTGT TCCCTCTGCT GAGATAAAGA ATTCCCTTAA CATGATATAA 2640 

TCCACCCATG CAAATAGCTA CTGGCCCAGC TACCATTTAC CATTTOCCTA CAQAATTTCA 2700 

TTCAGTCTAC ACTTTGGCAT TCTCTCTGGC GATOGAGTGT GQCTGGGCTG ACOGCAAAAG 2760 

OTGCCTTACA CACTGCCCCC ACCCTCAGCC GTTGCCCCAT CAQAQGCTGC CTCCTCCTTC 2820 

TGATTACCCC CCATGTTGCA TATCAGGGTG CTCAAGQATT GGA6AGGAGA CAAAACCAGG 2880 

AGCAGCACAG TGGGGACATC TCCC6TCTCA ACA6CC0CAG GCCTAT GGGG GCT CTGGAA G 2940 

GATGGGCCAG CTTGCAGGGG TT6GGGAGGG AOACATCCRG CTTGGGCTTT CCCCTTTGGA 3000 
ATAAACCATT GGTCTGTC 

Seq ID NO: 95 Protein sequence: 
Protein Accession #: NP_036233.1



1 11 21 31 41 51 

MEAADASRSK GSSPEARDAR SPSGPSGSLE NGTXADGKDA KTTlKaiGGEA ABGKSLGSAL 60 

KP6EGRSALP AflNEWHRPlI QPVBSGDDKN SNYPSKDSMB fflCRSPYAGLQ LGAAKKPPVT 120 

FABKGDVRKS IFSESRKPTV SIMEPGETRR KSYPRADTOL FSRSKS6SEB VLC35SCIGNK 180 

QKAVKSCLVC QASPCBLHLK PHLEGAAFRD HQLLEPIRDP EARKCPVHGK TMELFCXJTDQ 240 

TCICYIiCMFQ BHKNHSTVTV EBAKAEKBTE LSLQKEQLQL KIIEIEDEAE KMQKEKDRIK 300 

SPTINBKAIL BQNPRDLVRD LBKQKEEVRA ALEQRBQDAV DQVKVIMDAL DERAKVLHED 360 

KDTREQUISI SDSVLPLQBP GALMSNYStP PPLPTYHVLL EQEGLGQSLG NFKDDLUJVC 420 

MHHVBKMCKA OliSRUFIBRN HMBHOtamSY VHNYINSFGG EW8APDTMKR YSMYLTPKGG 480 

VRTSYQPSSP GHFTKETTQK NPHKfLYGTKG NYTSRVWBYS SSIQNSDNDL PWQGSSSPS 540 
LKGYPSIiMRS QSPKAQFQTW KSGXQTKLSH YRPFYVNK6H GI6SNBAP 

Seq ID NO: 96 DNA sequence

Nucleic Acid Accession #: NM_060666.1

Coding sequence: 63-841 

1 11 21 31 41 51 

GQCACGAGG6 CAGCGAGTGG CCTTCCOGGT TGGCX3CG0GC CCGGGGC3GGC GGCGCTGGAG 60 

GAGCTOGAGA CGGAGCCTAQ TTATGTCTGG GAGGCGAAOG 0GGTCXX3GAG GA6CCSGCTCA 120 

GCGCTCCX5GG CCAAGGGCCC CATCTCCTAC TAAQCCTCTG CGGAGGTCCC AGCXSGAAATC 180 

AGGCTCT6AA CTCCOQAGCA TCXTTCCCTGA AATCTGGCCG AAGACACCCA GTGCX3GCTGC 240 

AGTCAGAAA6 OCCATOQTCT TAAAGAOQAT OQTQGCCCAT GCTGTAQAGG TCCCAGCTGT 300 

CCAATCa«XT CJOCAQGAGCC CTAGGATTTC CTTTTTCTTG GAGAAAGAAA ACGAGCOCCC 360 

TCGCAGGGAG CTTACTAAG6 AGQACXTTTTT CAAGACACAC AGCGTCCCTQ CCACCCCCAC 420 

CAGCACTCCT GTGCCGAACC CTGAGGCCGA GTCCAGCTCC AAGGAAGGA6 A6CTGGACGC 480 

CAGAGACTTO GAAATGTCTA AGAAAGTCAG GCGTTCCTAC AGCCGGCrGG AGACCCTGG6 540 

CTCTGCCTCr ACCTCCACOC CAGGCCGOCG GTCCTGCTTT GGCTTCGAGG GGCTGCTOGG 600 

GGCAGAAGAC TTOTCGGQAO TCTCGCCAGT GGTGTGCTCC AAACTCACCX3 AGGTCCCCAG 660

GGTrrGTGCA AAGCCCTGGG CCCCAGACAT QACTCTCCCT GGAATCTCCC CACCACGC6A 720 

GAAACAGAAA OGTAAGAAGA AGAAAATGCC AGAGATCTTG AAAACG6A0C TQ SATGA GTG 780 

GOCTGCQGCC ATX3AATGCCG AGTTTGAAGC TGCTGAGCAG TTTGATCTOC TGCTTGAAT6 840 

AOATOOWnO GGGGGTGCAC CTGGCCAGAC TCTCCCTCCT GTCCTGTACA TAGCCACCTC 900 

CCTOTCGAGA GGACACTTAQ GGTCCCCTCC CXTrGGTCTTG TTACCTGTGT GTGTGCTGGT 960 

GCrGC3GCATG AGGACTGTCT GCCTTTGA6G GCTTGGGCAG CAGCGGCAGC CATCTTGQTT 1020 

TTAGGAAATG GGGCCOCCTG GCCCAGCCAC TCACTGGTGT CCTGTCTCTT GTCGTOCTGT 1080 

CCTTCCTATC TCCCCAAAGT ACCATAGCCA OTTTCC3U3AT GGGCCACAGA CTGQGOAGGA 1140 

GAATCAGTGG CCCAGCCAGA AGTTAAAGGG CTQAGGGTTG AGGTGAGAGG CACCTCTGCT 1200 

CTTOTTGGQA GGGGTGGCTG CTTGGAAATA GGCCCAGGGG CTCTGCCAQC CTCGGCCTCT 1260 

CCCTCCTOAG TTGCCTTCTG rrGGTGGCTT TCTTCTTGAA CCCACCTGTG TA AAGAGG TT 1320 

TTCAGTTCCG TGGGTTTCCC CTTTGATTCT GXAAATAGTC CCAQAGAGAA TTCOT^GCT 1380 

GAGGGCAATT CTGTCTTGQA GQAAGAAGCT GGACATTCAO CCTGTOGAGT CTGAGTTTTG 1440 

AAGGATGTAG GGAGCCTTAG TTGGGTCTCA GACX3VTAAGT GTGTACTACA CAGAAGCTGT 1500 

GTTTTCTAGT TCTGGTCTGC TGTTGAGATG TTTGGTAAAT GCCAGGTTGA TAGGGCGCTG 1560 

GCTGCTTGGA GCAAAGGGTG CATTTCAGGQ T6TGGCCACC AGGTGCTOTG AGTTTCTOTG 1620 

QCTCATGGCC TCTGQQCTGG TCCCPIGCRC AQQGCCCAOQ CTOGAQTCIT AOCACTCTGC 1680 

TGCAGGGGTG GAAGGTGOCC CCTCTTGTCa CCCATACCCA TTTCTTACAA AATAAGTTAC 1740 

ACCGAGTCTA CTTC6CCCTA GAAGAGAAAG TTGAAGAGTC CXAGACCTAC TAGCATTTTG 1800 

CAACTATGCT TGTAAAGTCC TCGGAAAGTT TCCTCGCGTA CCAGACAGCG GCX3GGQQCTG 1860 

ATAGCAATTT TAOTTTTTGG CCTCCCTATC CTCTCACATG AGAACACTGC CTGQATGCAT 1920 

CTCATGATCT CIGGftGAATT TCCCCATCTT TCTCTTCTTT OCATOGXGTG GATTCAATAG 1980 

rrrGQATTTG AAGGCTGCCC TGCCCCOSAC TCTCCTGCCG CACCCCTGGC CATTOTACCT 2040 

TTTGATGTTT AGAAGTTCGT GGAAGTAGAC GCTGAGGTGT GCAGAGGAGC TGGTGGATAA 2100 

CAGAGAATGC CAGGGAAGAT GAGTGCTGGG TCAGGGTACT TGGATGAAAC GGTCCAGGCC 2160 

AGGCGGGCCC TAATAAAACC CTCTGCCAGG TCTGGGAOTC CCAGGCXaiTC T6CTCAACGC 2220 
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TCTGTGGnr GTCAGACCTG CAA6CMU3CC CCCI6CTGGG 6AA6CCTA66 TGTCCTTGAO 
CTGAACCGCA CTQAHOIkACT CTTOTCCTCft CTOTCIOlttO CaGOOftACT CTTOOOWUtt 
GTCTTASICC roCDOAATCa QOAQTCACCK GKrOMaCAO AST TSM ATC ATCMTGCM 
ABTTCTCTOT TCCrOAGOMV CTAAA3TTAA OQAAAAAMKI OOKrm'UTT TTMAOTTSO 
AAMAAAOCC TGATTAAAOA O 'mUT a CCl' OITAkAAJkM AAMAAAAM AAAMA 

Seq ID NO: 97 Protein sequence:
Protein Accession #: NP_542399.1

1 11 21 31 41 51

I I I I I I 

M8GSRRTRSG6 AAQRSGPRAP 8PTKPLSRSQ RK8GSELPSI LFBINPKTP8 AAA VRKP IVL 
XRZVAH2WEV PAVQSPRRSP RISFFLBKEN EPP6RELTKE DLFKTBSVPA TPTSTPVPNF 
BAESSSKEGH LDARDLEMSK KVRRSYBRLB TLGSASTSTP 6RRSCF0FEG LLGAEDIiSSV 
SPWCSKLTE VPRVCRKPWA PDMTLPGISP PPEKQKRKECK KMPBILKTEIi DWAAAMMAB 
FEAAEQFDLL VE 
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I I I I 

GGGGCATTTC OSGGTCCGGG CTGAGCGGGC GCACX3CGC3GG GAGCGGGACT CGGCGGCATG 60
GCGGGCTCCG GAGCCX3GTGT GCGTTGCTCC CTGCTGCGGC TGCAGGAGAC CTTGTCCGCT 120
GCGGACCGCT 6CGGTGCTGC CXTTGCCCGGT CATCAACTGA TCCGCGGCCT GGGGCAGGAA 180
TGCOTGCTGA 6CAGCAGCCC GGCSGTQCXG GCATTACAGA CATCTTTAGT TTTTTCCA6A 240
GATTTGGGTT TGCTTGTATT TGTCCGGAAG TCACTCAACA GTATT6AATT TCX3TGAATGT 300
AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGCCA GAAGATOSCA 360
eCTTACTCTG TTGAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420
AAAT6TAAAA TTCCAGCCCT GGACCTTCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480
AGACTCATQ6 ATQAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATGQ AGAACTTGCA 540
TTGAAAAAAA AAATACCAGA TACAGTTTTA GAAAAA6TAT ATQAGCTCXn' AGGATTATTQ 600
GGTGAAGTTC ATCCTAGTGA GATGATAAAT AATGCSVOAAA ACCTGTTCOQ CGCTTTTCTG 660
GGTGAACTTA AQACCCAGAT GACATCAGCA GTAAQAGAGC CCAAACTACC TOTTCrGGCA 720
GGATGTCTGA AGG6GTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAQAT 780
CCOCAGACTT CAAGGQAGAT TTTTAATTTT GTACTAAAGG CAATTCGTCC TCAGATTQAT 840
CTGAAGAGAT ATOCTGTGCC CTCAGCTGGC TTGCGCCTAT TTGCCCTGCA TGCATCTCAG 900
TTTAGCACCT GCCTTCTGGA CAACTACGTG TCTCTATTTG AAGTCTTGrr AAA6TGGT6T 960
GCCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAQ CCCTGGAATC CTTTCTGAAA 1020
CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GGAGTACTTT 1080
ATGGAGCAGT TTTATGGAAT CATCAGAAAT GTGGATTCGA ACAACAAGGA GTTATCTATT 1140
GCTATCCGTG GATATGGACT TTTTGCAGGA CCGTGCAAGG TTATAAACGC AAAAGATGTT 1200
GACTTCATGT ACGTTGA6CT CATTCAGC6C TGCAAGCAGA TGTTCCTCAC CCAGACAGAC 1260
ACTGGTGACG ACOGTGTTTA TCAQATGCCA AOCTTGCTCC AGTCTOTTOC AAGGQTCTTO 1320
CTGTACXnrrG ACACAGTTCC TGAGGTGTAT ACTOCAOTTC TQQAOCACCT GGTGGTQAT9 1380
CAGATAGACA GTTTCCCACA GTACAGTCCA AAAATGCXGC TGGTGTGTTG CAGAGCCATA 1440
GTOAAGGTGT TCCTAGCTTT GGCAGCAAAA GGGCCAGTTC TCAGGAATTG CATTAGTACT 1500
0T6GTGCATC AGG6TTTAAT CAGAATATGT TCTAAACCAG TGGTCCTTCC AAAGGGCCCT 1560
GAGTCTGAAT CTGAAGACCA COGTGCTTCA 0GG6AAGTCA GAACTGGCAA ATGGAAGGTG 1620
CCCACATAGA AAGACTAOGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGAT6ATG 1680
6ATTCTATTT TAGCAGATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAGTCTGAAT 1740
CATTTACTTT ATGATGAATT TGTAAAATCC GTTTTGAAGA TTGTTGAQAA ATTGQATCTT 1800
ACACTTGAAA TACAGACTGT TG6GGAACAA GAGAATGGAG ATGAGGCGCC TGGTGTTTGG 1860
AT6ATCCCAA CTTGAGATCC AOGGGCTAAC TTGCATCCAO CTAAACCTAA AGATTTTTGB 1920
GCTTTCATTA ACCTGGTGQA ATTTTGCAGA QAQATTCTCC CTSAGAAACA AGCAGAATTT 1980
TTTGAACCAT GGGTGTACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCX: 2040
CTCATCAGTG GTTTCTACAA ATTGCTTTCT ATTACAGTAA GAAATGCCAA GAAAATAAAA 2100
TATTTOGAGG 6AGTTAGTCC AAAGAGTCTG AAACACTCTC CTQAAGACCC AGAAAAGTAT 2160
TCTTQCTTTG CTTTATTTST GAAATTTGGC AAAtaGGTGG CAGTTAAAAT GAAGCAGTAC 2220
AAAGAT6AAC TTTT6GCJCTC TTOTTTGACC TTTCTTCIGT CCTTGCCACA CAACATCATT 2280
GAACTCGATG TTAGAGC3CTA CGTTCCTGCA CTGCAGATGG CTTTCAAACT GGGCCTGAGC 2340
TATACCCCCT TGGCAGAAGT AqCCCTGAAT GCTCTAGAAG AATGGTCAAT TTATATTGAC 2400
AGACATGTAA TGCAGCCTTA TTACAAAGAC ATTCTCCCCT GCCTGGATG6 ATACCTGAAG 2460
ACTTCAGCCT TOTGAGATQA GACCAAOAAT AACTGGOAAG TGTCAGCTCT TTCTCGGGCT 2520
GCCCAGAAAO GATTTAATAA AGT6GTGTTA AAGCATCTGA A6AAGACAAA GAACCTTTCA 2580
TCAAACGAAG CAATATGCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTGGATCT 2640
CTAGGA6GAC AAATAAACAA AAATCTTCTG ACAGTCACGT CCTCAGATGA GATGATGAAG 2700
A6CTATGTGG CCT6GGACAG AGAGAAGCGG CTGAGCTTTG CAQTGCCCTT TAQAGAGATG 2760
AAACCTGTCA TTTTCCTGGA TOTGTTCCTG CCTCGAGTCA CAGAATTAGC GCTCACAGCC 2820
AGTGACAflAC AAACTAAAOT TGCAQOCTOT QAACTTTTAC ATAGCATGGT TATGTTTATG 2880
TTGGGCAAAG CCACGCAGAT GCCAGAAGGG GGACAGGGAG CCCCACCCAT GTACCAGCTC 2940
TATAAGCGQA CGTTTCCTGT GCTGCTTOGA CTTGCXJTGTG ATGTTGATCA G6TGACAAGG 3000
CAACTGTATG AGCCACTAGT TATGCAGCTG ATTCACTGGT TCACTAACAA CAA6AAATTT 3060
GAAAQTCAG6 ATACTGTTGC CTTACTAGAA GCTATATTGG ATGGAATTGT GGACCCTGTT 3120
GACAGTACTT TAAGAGATTT TTGTGGTCGQ TGTATTCGAG AATTCCTTAA ATGGTCCATT 3180
AAGCAAATAA CACCACAGCA GCAGGAGAAG AGTCCAGTAA ACACCAAATC 6CTTTTC3AG 3240
CGACTTTATA GCCTTGC6CT TCACCCCAAT GCTTTCAAOA GGCTGOGAGC ATCACTTGCC 3300
TTTAATAATA TCTACAGGGA ATTCAGGGAA 6AAGAGTCTC TGGTGGAACA GTTTQTGTTT 3360
GAAGCCTTGG TGATATACAT QGAGAGTCTG 6CCTTAGCAC ATGCAGATGA GAAGTCCTTA 3420
GGTACAATTC AACAGTGTTG TGATGCCATT GATCACCTAT GCCGCATCAT TGAAAAGAAG 3480
CATGTTTCTT TAAATAAAGC AAAGAAAG6A CGTTTGCCGC GAGGATTTCC ACCTTCGGCA 3540
TCATTGTGTT TATTGGATCT GGTCAAGTGG CTTTTAGCTC ATTGTGGGAG GCCCCAGACA 3600
GAATGTOSAC ACAAATCCAT TGAACTCTTT TATAAATTOG TTCCTTTATT 6CCA6GCAAC 3660
AGATCXXXTTA ATTTGTGOCT GAAAQATGTT CTCAAGGAAG AAGGTGTCTC TTTTCTCATC 3720
AACACCTTTG AGGGGGGTGG CTCTGGCCAG CCCTCGGGCA TCCTGGCCCA GCX:CACCCTC 3780
TTGTACCTTC GGGGGCCATT CAGCCTGCAG 6CCACGCTAT GCTGGCTGGA CCTGCTCCTG 3840
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GCXX5CGTTGG AOTQCTACAA CAOGTTCA.TT G6CGAGA6AA CTC3TAGGAGC GCTCCAGGTC 3900 

CTAOOTACTO AAOCCCAOTC TTCACTTTTQ AAA6CAGTG6 CTTTCTTCTT AGAAAGCATT 3960 

GCXATGCATG ACATTATAOC AGCAOAAAAG TGCTTTGGCA CTQGGGCAGC AGGTAACAGA 4020 

ACAAQCCCAC AAGAGGGAGA AAGGTACAAC TACAGOVAAT GCACC3GTTGT GGTCCXSGATT 4080 

ATGGAGTTTA CCAOGACTCT GCTAAACACC TCCCCGGAAG GATGGAAGCT CCTGAAGAAG 4140 

GACrTGTGTA ATACACACCT GATGAGAGTC CTGGTGCAGA OGCTGTGTGA QC00GCAA6C 4200 

ATAGGTTTCA ACATCGGAGA OGTCCAGGTT ATGGCTCATC TTCCTGATGT TTGTGTGAAT 4260

CIGATGAAAG CTCTAAAGAT GTCCCCATAC AAA6ATATCC TA6AGACCCA TCTGAGAGAS 4320 

AAAATAACAG CACA6AGCAT TGAGOAGCTT TGTGCCGTCA ACTTQTATGG CCCTGA06CG 4380 

CAAGTGGACA GGAGCAQGCT GGCTGCTGTT GTGTCTGOCT GTAAACAGCT TCACAQAGCT 4440 

GGGCTTCreC ATAATATATT ACXX3TCTCA0 TCXaCRGATT TGCATCATTC TGTTG6CACA 4500 

GAACTTCTTT GCCTGGTTTA TAAAOGCATT GOCCCTGGAG ATGAGAGACA GTCTCTGCCT 4560 

TCTCTAGACC TCAOTTOTAA aCAGCTGGGC AGCGGACTTC TGGAGTTAGC CTTTGCTTTT 4620 

GGAGGACTQT GT6A0CGCCT TGTGAGTCTT CTCCTGAACC CAGCGGTGCT GT0CA0GGG6 4680 

TCCTTGGGCA GCTCACAGQG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAOC 4740 

TTGTTCTCAG AAACGATCAA CAOGGAATTA TTGAAAAATC TGGATCTTGC TOTATTGQAG 4800 

CTCATQCAGT CTTCAGTOQA TAATACCAAA ATGGTGAGTG CCGTTTTGAA CGGCATGTTA 4860 

6ACCA6AGCT TCA66GA6C6 AGCAAACCAG AAACACCAAG GACTGAAACT TGOQACTACA 4920 

ATTCTCCAAC ACTGGAAGAA GTGTGATTCA TGGTGGGCCA AAGATTCOCC TCTCG AAACT 4980 

AAAATGGCAG TGCTGGCCTT ACTGGCAAAA ATTTTACAGA TTC3ATTCATC lOTATCTTTT 5040 

AATACAAGTC ATGGTTCATT CCCTGAAGTC TTTACAACAT ATATTAGTCT ACTTGCTOAC 5100 

ACAAAGCTGG ATCTACATTT AAAGGGCCAA GCTGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CTCACTG6AG 6CAGTCTCGA GGAACTTAGA OGTGTTCTGG AGCAGCTCAT CGTTGCTCAC 5220 

TTCXrCATGC AGTCCAGGGA ATTTCCTCCA GGAACTCG6C G6TTCAATAA TTATG TGGAC 5280 

TGCATGAAAA AGTTTCTAOA TGCATTGGAA TTATCTCAAA 6C0CTATGTT GTTGGAATTG 5340 

ATGACA6AAG TTCTTTGTCQ GGAACAGCAG CATGTCATGG AAGAATTATT TCAATCCAGT 5400 

TTCAG6AGGA TTGCCAGAAG GGGTTCATGT GTCACACAAG TAGGCCTTCT GGAAAGCGTG 5460 

TATGAAATGT TCAQQAAGGA TOACXTCCCGC CTAAGTTTCA CACGCCAGTC CTTTGTGGAC 5520 

CGCTCCCTCC TCACTCTGCT GTGGCACTGT AGCCTGGATG CTTTGAGAQA ATTCTTCAGC 5580 

ACAATTGTGG TQGATGCCAT TGATGTGTTG AAOTOCAGGT TTACAAASCT AAA.TG AATCT 5640 

ACCTTTGATA CTCAAATCAC CAA6AAGATG GGCTACTATA AOATTCTASA Oa TQAT GTAT 5700 

TCTCGCCTTC CCAAAGATQA TGTTCATGCT AAGGAATCAA AAATTAATCA AGTTTTCCAT 5760 

GGCTCGTGTA TTACAGAAGG AAATGAACTT ACAAAGACAT TGATTAAATT G TGCT ACGAT 5820 

GCATTTACAG AGAACATGGC AGGAGAOAAT CAGCTGCTGG AfiAGGAGAAG ACTTTACCAT 5880 

TGIGCAGCAT ACAACIGCXSC CATATCTGTC ATCTOCTGTQ TCTTCA ATGA GTTAAA ATTT 5940 

TACCAAG6TT TTCTOTTTAO TGAAAAACCA GAAAAGAACT TGCTTATTTT TGAAAATCTG 6000 

ATOGACCTQA AGCGCOGCTA TAATTTTCCT 6TAQAAGTTG AGGTTCCTAT GGAAAGAAAG 6060 

AAAAAGTACA TTQAAATTAG GAAAGAAGCC AGAGAAGCAG CAAATGGGGA TTCAGATGGT 6120 

OCTTCCTATA TOTCTTCCCT GTCATATTTG GCAGACAGTA CCCTGAGTGA GGAAATGAGT 6180 

CftATTTOATT TCTCAACOGO AGTTCAOAOC TATTCAXACA GCTCCCAASA CCCTAGACCI 6240 

GCCACTGGTC GTTTTCG6A8 ACQGGAGCAG OGGGACCCCA CGGTGCATGA TGATGTGCT6 6300 

GAGCTOGAGA TGGAOGAGCT CAATCGGCAT 6AGTGCATGG CGCCCCTGAC GGCCCTGGTC 6360 

AAGCACATGC ACAQAAGCCT GGGCCCJGCCT CAAGGAGAAG AGQATTCAGT GCCSkAaAQAT 6420 

CTTCCTTCTT QGATGAAATT CCTCCATGGC AAACTGG6AA ATCCAATAOT ACCATTAAAT 6480 

AT0C30TCTCT TCTTAGCCAA GCTTGTTATT AATACAGAAG A0GTCTTT08 CCCTTACQGG 6540 

AAOCACTQGC TTASCCCCTT GCTGCaGCTG GCTGCTTCTG AAAACAATGG AOGAGAAGCA 6600 

ATTCACTACA TGGTGGTTGA GATAGTGGCC ACTATTCTTT CATGGACAGG CTTGGCCACT 6660 

CCAACAGGGG TCCCTAAAGA TGAA6TGTTA GCAAATOQAT TGCTTAATTT CCTAATGAAA 6720 

CATGTCTTTC ATCCAAAAAG AGCTGTGTTT AQACACAACC TTOAAATTAT AAAGACCCTT 6780 

GTOGAGTGCT G6AA06ATT6 TTTATCCATC CCTTATAGGI TAATATTTGA AAAGTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA 6TASGGAT7C AATTGCTAOG CATOGTGATG 6900 

GCCRATGACC TGCCTCCCTA TGACCCACAG TGTGGCATCC AGAGTAGCGA ATACTTCCAG 6960 

GCTTTGGTGA- ATAATATGTC CTTTGTAAGA TATAAAGAAG TGTATGCXX5C TGCAGCAGAA 7020 

GTTCTAGGAC TTATACTTC36 ATATGTTATG GAQAGAAAAA ACATACTGGA GGA GTCT CTG 7080 

TaTGAACTOG TTGG8AAACA ATTGAAGCAA CATCAGAATA CTATGGAGQA CAAGTTTArr 7140 

GTGTGCTTGA ACAAAGTGAC CAAGACCTTC CL'lWl' CTTO CAOACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCTGCCAAA ATTTCATGGA GTGTTGAAAA CACTCTQTCT GGAGGTGGTA 7260 

CTTTGTCGTG TGGAGGGAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA GGACTTCGTT 7320 

CAA6TCATGA GACATAGAGA TGATGAAAGA CAAAAAGTAT GTTTGGACAT AATTTATAAG 7380 

AIGATGCCAA AGTTAAAACC AGTAGAACTC CXJAGAACTTC TOAACCCCGT TGTGGAATTC 7440 

GTTTCCCATC CTTCTACAAC ATOTAGGQAA CAAATGTATA ATATTCTCAT GTGGATTCAT 7500 

GATAATTACA GAGATCCAGA AAGTGAGACA GATAATGACT CCCAGGAAAT ATTTAAGTTG 7560 

GCAAAAGAIG TCCTGATTCA AGGATTGATC GATQAQAACC CTGGACTTCA ATTAATTATT 7620 

C9GAAATTTCT GGAGCCATGA AACTAGQTTA CCTTCAAATA CCTTGGACCX5 GTTGCTGGCA 7680 

CTAAATTCCT TATATTCTOC TAAGATAGAA GTGCACrTTT TAAGTTTAGC AACAAATTTT 7740 

CIXSCTOSAAA TQACCAGCAT OAGOXAGAT TATCCAAACC CXaTGTTCXSA GCATCCTCTG 7800 

TCAGAATGCG AATTTCAGQA ATATACCATT GATTCTGATT GGC6TTTC0G AACTACTOTT 7860 

CTCACTCXm TGTTTGTGGA GACCCAGGCC TCCCAGGGCA CTCTCCAGAC CCGTAC0CA8 7920 

GAAGGGTCCC TCTCAGCTCG CTGGCX3^G GCAGGGCAGA TAA6GGCX»C CCAGCAGCAG 7980 

CATGACTTCA CACTGACACA GACTGCAGAT GGAAGAAGCT CATTTGATTG GCTGACC3GGG 8040 

A6CAGCACTG ACCCGCTGGT OGACCACACC AGTCCCTCAT CTGACTCCTT GCT GTTTG CC 8100 

CACAAGAGGA GTGAAAGGTT ACAGAGAGCA CCCTTQAAGT CAGTGGG6CC TGATTTTGGG 8160 

AAAAAAAGGC TGGGCCTTCC AGGGQAOQAG 6TGGATAACA AAGT6AAAG6 TGCGGCG6GC 8220 

CX^GACGGACC TACTAOQACT G0GCAGA06G TTTATGAGGG ACCAGGAGAA GCTCAGTTT6 8280 

ATGTATGCCA GAAAAGGC3GT TGCT6AGCAA AAACX3AGAGA A6GAAATCAA GAGTQAGTTA 8340 

AAAATGAAGC AGGATGCCCA GGTCGTTCTO TACAGAAGCT ACCGGCACGG AGACCTTCCT 8400 

GACATTCAGA TCAAGCACAO CAGCGTCATC ACCCOGTTAC AGGCCGTGGC 0CA6AGGGAC 8460 

CCAATAATTG CAAAACAGCT CTTTAGCAGC TTGTTTTCT6 OAATTTTGAA AGAGATGGAT 8520 

AAATTTAAGA CACTOTCT6A AAAAAACRAC ATCACTCftAA AOTTGCTTCA AGACTTCAAT 8580

CGTTTTCTTA ATACCACCTT CTCTTTCTTT CCACCCTTT6 TCTCTTGTAT TCAQQACATT 8640 

AGCTGTCAGC AOGCAGCCCT GCTGAGCCTC GACCCAGOGG CTGTTAGCGC TGGTTGCCTG 8700 

GCCAOCCTAC A6CA60CCGT GGGCATCCGC CTGCTA6AGG AGGCTCTGCT COGCCTGCTG 8760 

CCTGCTGAGC TGCCTGCCAA GG6AGTGC6T GGOAAGGCCC GQCtCCCTCC TQATOTCCTC 8820 

AGATGGGTG6 AGCTT6CTAA GCTGTATAGA TCAATTGQAO AATACGA06T CCTCGQTGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940 

AGAAGTGATT ATTCTGAAGC TGCTAAGCAG TATQATGAGG CTCTCAATAA ACAAGACTGG 9000 

GTAGATGGTG A6CCCACAGA AGCCGAGAAG GATTTTTGGG AACTT6CATC CCTTGACTGT 9060 
TACAACCACC TTGCTGAGT6 GAAATCACTT GAATACTGTT CTACAGOCAG TATAGACAGT 9120
OAGAAGCCCC CAGACCTAAA TAAAATCTGG AGTGAACCAT TTTATCAGGA AACATATCTA 9180
CCTTACATGA TC06CAGCAA GCTGAAGCT6 CTGCTCCAG6 6AGAG6CT8A CCA6TCCCTG 9240
CTGACATTTA TTGACAAAGC TATGCACGQG GAGCTCCAGA A6GC6ATTCT AGAQCTTCAT 9300
TACAGTCAAG AGCTGAGTCT GCTTTACCTC CTGCAAGATG ATGTTGACAG AGCCAAATAT 9360
TACATTCAAA ATGGCATTCA GAGTTTTATG CAGAATTATT CTAGTA7TGA TGTCCTCTTA 9420
CACCAAAGTA GACTCACCAA ATTGCAGTCT GTACAGGCTT TAACAGAAAT TCAGQAGTTC 9480
ATC3^GCTTTA TAAGCAAACA AGGCAATTTA TCATCTCAAG TTCCCCTTAA GAGACTTCTG 9540
AACACCTGGA CAAACAOATA TCCAQATGCT AAAATGGACC CAATGAACAT CTGGGATGAC 9600
ATCATCACAA ATCQATGTTT CTTTCTCAGC AAAATAGAGG AGAAGCTTAC CCCTCTTCCA 9660
GAA6ATAATA 6TATGAATGT GQATCAAGAT GGAGACCCCA 6TGACAGGAT G6AAGT6CAA 9720
GAGCA6GAA0 AAGATATCA6 CTCCCTGATC AGGAGTTGCA AGTTTTCCAT GAAAATGAAO 9780
ATGATAGACA GT6C(S6GAA GCAGAACAAT TTCTCACTT6 CTATGAAACT ACTGAAGGAO 9840
CT6CATAAAG AGTCAAAAAC CAGAGACGAT TGGCTGQTGA GCTGGGTGCA QAGCTACTGC 9900
CGCCTGAGCC ACTGCCGGAG CCGGTCCCAG GGCTGCTCTG AGCAGOOXSCT CACTGTGCTG 9960
AAAACAGTCT CTTTGTTGGA TGAGAACAAC GTGTCAAGCT ACTTAAGCAA AAATATTCTG 10020
GCTTTC0GT6 ACCAGAACAT TCTCTTGGGT ACAACTTACA GGATCATAGC 6AATGCTCTC 10080
AGCAGTGAGC CAGCCTGCCT TGCTGAAATC GAGGAGGACA AQGCTAGAAG AATCTTA6A0 10140
CTTTCTGGAT CCAGTTCAGA GGATTCAGAG AAGGTGATGG CGGGTCTGTA CCAQAGAGCA 10200
TTCCAGCACC TCTCTGAGGC TGTGCAGGCG GCTGAGGAGG AGGCCCAGCC TCCCTCCTGG 10260
AGCTGTGGGC CTGCAGCTGG GGTGATTGAT GCTTACATGA CGCTGGCAGA TTTCTGTGAC 10320
CAACAGCTGC 6CAAGGAGGA AGAGAAT6CA TCAGTTATTG ATTCTGCAGA ACTGCAGGOG 10380
TATCCAGCAC TTGTGGTGOA GAAAATGTTG AAA6CTTTAA AATTAAATTC CAATQAAGCC 10440
AGATTGAAGT TTCCTAGATT ACTTCAGATT ATAGAACGGT ATCCAGAGGA GACTTTGA6C 10500
CTCATGACAA AAGAGATCTC TTCCGTTCCC TGCTGGCACT TCATCAGCTG GATCAGCCAC 10560
ATGGTGGCCT TACTGGACAA AGACCAAGCC GTTGCTGTTC AGCACTCTGT GGAAGAAATC 10620
ACTGATAACT ACCCGCAGGC TATTGTTTAT CCCTTCATCA TAAGCAGC6A AAGCTATTCC 10680
TTCAAGGATA CTTCTACTGG TCATAAGAAT AAGGAGTTTG TGGCAAGGAT TAAAAGTAAG 10740
TTGGATCAAG GAGGAGTGAT TCAAGATTTT ATTAATGCCT TAQATCAGCT CTCTAATCCT 10800
QAAGTGCTCT TTAAGGATTG GAGCAATGAT GTAAGAGCTG AACTAGCAAA AACCCCTGTA 10860
AATAAAAAAA ACATTGAAAA AATGTATGAA AGAATGTATG CAGCCTTGGG TGACCCAAA6 10920
GCTCCAGGCC TGGGGGCCTT TAGAAGGAAG TTTATTCA6A CTTTTGGAAA AGAATTTGAT 10980
AAACATTTTG GGAAAGGAGG TTCTAAACTA CTGAGAATGA AGCTCAGTGA CTTCAACGAC 11040
ATTACCAACA T6CTACTTTT AAAAATGAAC AAAGACTCAA AGCCCCCTGG GAATCTGAAA 11100
GAATGTTCAC CCTGQATGAG OQACTTCAAA GTOaAOTTCC TGAGAAATGA GCTG6AGATT 11160
CCGG6TCAGT ATGACGGTAQ GGGAAAGCCA TT60CAGAGT ACCA06TGC6 AATCGCOGGG 11220
TTTGATGAGC GGGTGACAGT CATGGCGTCT CTGOGAAGGC CCAAGCGCAT CATCATCOGT 11280
GGCCATGAGG AGAGGGAACA CCCTTTCCTG GTGAAGGGTG GCX5AGGACCT GOGQCAGGAC 11340
CAGCGCGTGG A6C3U3CTCTT GCAGGTCATG AATGGGATCC TGGCCCAAGA CTCCGCCT6C 11400
AGGCAOAGGQ CCCIGCAOCT OAGGACCTAT AGCQTTGTQC CCATGACCTC CAGOTTAOGA 11460
TTAATTGAGT G6CTTGAAAA TACTGTTACC TTGAAGGACX: TTCTTTTGAA CACCATGTCC 11520
CAAGAGGAGA AGGCGGCTTA CCTGAGT6AT CCCAGGGCAC CGCGGTGTGA ATATAAAGAT 11580
TGQCTGACAA AAATGTCAGG AAAACATGAT GTTGGAGCTT ACATGCTAAT GTATAAGGGC 11640
GCTAATCGTA CTGAAACAGT CAOGTCTTTT AQAAAACX3AG AAAGTAAAGT GCCTGCTGAT 11700
CTCTTAAAGC G66CCTT0GT GAGGATGAGT ACAA6CCCTG AGGCTTTCCT GGOGCTC06C 11760
TCCCACTTCG CCAGCTCTCA GGCTCTGATA TGCATCAGCC ACIGGATCCT CGGGATTGGA 11820
GACAGACATC TGAACAACTTT TATGGTGGCC ATGGAGACTG GOGGGGTGAT CGG6ATCX3AC 11880
TTTGG6CATG CXSTTTGGATC CX3CTACACAG TTTCTGCCAG TCCCTGAGTT 6ATGCCTTTT 11940
OGOCTAACTC GCCA6TTTAT CAATCTGATG TTACCAAT6A AAGAAACGGG CCTTATGTAC 12000
AGCATCATGG TACA06CACT CGGGGCCTTC C8CTCAGACC CTGGCCT6CT CAOCAACACC 12060
ATGGATGTGT TTGTCAAGGA GCCCTCCTTT 6ATTGGAAAA ATTTTGAACA GAAAATGCTG 12120
AAAAAAG6AG GGTCATGGAT TCAAGAAATA AATGTTGCTG AAAAAAATTG GTAGCCCOGA 12180
CAGAAAATAT GTTAOGCTAA 6AGAAAGTTA GCAGGTGCCA ATCCAGCAGT CATTACTTGT 12240
GATGAGCTAC TCCTGGGTCA TGAGAAGGCC CCTGCCTTCA GAGACTATGT G6CTGTGGCA 12300
CiGAGGAAGCA AAGATCACAA CATTOGTGCC CAAGAACCAG AGAGTGGGCT TTCAGAAGAG 12360
ACTCAAGTGA AGTGCCTGAT GGACCAG6CA AGAGACCCCA ACATCCTTG6 CAGAACCT6G 12420
GAAGGATGGG AGCCCTGGAT GTGAGGTCTG TGGGAGTCTG CAGATAGAAA GCATTACATT 12480
GTTTAAAGAA TCTACTATAC TTTGGTTGGC AGCATTCCAT OAGCTGATTT TCCTGAAACA 12540
CTAAAGAGAA ATGTCTTTTG TGCTACAGTT TOGTAGCATG AGTTTAAATC AAGATTATGA 12600
TGAGTAAATG TGTATGGGTT AAATCAAAGA TAAGGTTATA GTAACATCAA AGATTAGGTG 12660
AGGTTTATAG AAAGATA6AT ATCCAGGCTT ACCAAAGTAT TAAGTCAAGA ATATAATATG 12720
TGATCAGCTT TCAAAGCATT TACAA6T6CT GCAAGTTAGT GAAACAGCTG TCTCCGTAAA 12780
TGGAG6AAAT OTGGGGAAGC CTT OQAATGC CCnf T OGTL ' CTGOCACATT G6AAAGCACA 12840
CTCAGAAGGC TTCATCACCA AGATTTTGGG AGAGTAAAGC TAAGTATAGT TGATGTAACA 12900
TTGTAGAAGC AGCATAGGAA CAATAAGAAC AATAGGTAAA GCTATAATTA TGGCTTATAT 12960
TTAGAAATGA CTGCATTTGA TATTTTAGGA TATTTTTCTA GGTTTTTTCC TTTdTTTTA 13020
TTCTCTTCTA GTTTTGACAT TTTATGATAG ATTTGCTCTC TAGAAGGAAA CGTCITTATT 13080
TAOGAGGGCA AAAATTTTQ6 TCATAGCATT CACTTTTGCT ATTCCAATCT ACAACTGGAA 13140
OATACATAAA AGTGCTTTGC ATT6AATTTG GGATAACTTC AAAAATCOCA TGGTTGTTGT 13200
TAGGGATAGT ACTAAGCATT TCAGTTCCAG GAGAATAAAA GAAATTCCTA TTTGAAATGA 13260
ATTCCTCATT TGGAGGAAAA AAAGCAT6CA TTCTAGCACA ACAAGATGAA ATTATGGAAT 13320
ACAAAAGTGG CTCCTTCCCA TGTGCAGTCC CTGTCCCCCC CCGCCAGTCC TCCACACCCA 13380
AACTQTTTCT GATTGQCTTT TAOCTTTTTG TTGTTTTTTT TTTTCCTTCT AACACTT6TA 13440
TTTGGAGGCT CTTCTGTGAT TTTGAGAAGT ATACTCTTGA GTGTTTAATA AAGTTTTTTT 13500
CCAAAA6TA



Seq ID NO: 99 Protein sequence:
Protein Accession #: NP_008835.5

1 11 21 31 41 51

I I I I I I

MAGSGAGVRC SLLRLQBTLS AADRCGAALA GHQLIRGr.GQ ECVLSSSFAV LALQTSLVFS 60 

RDPGLLVPVR KSLNSIBPRE CREElLKFIiC IFLEKMGQKI APYSVEIKNT CTSVYTKDRA 120 

AKCKIPALDL LIKLLQTFRS SRLMDEFKIG BLFSKFYGEL ALKKECIPDTV LEKVYELLGL 180 

U3EVHPSQ4I NNAEHLPRAF LGELKTQMTS AVREPKLFVL AGCLKGLSSL LCHFTKSMEE 240 

DFQTSRBZFN FVLKAIRPQI DZiKRYAVPSA GXiRLFAIiHAS QFSTCLLDliy V8LFEVUJDI 300 

GAH3NVELKK AALSALSSFIi KQVSMNVAXN AEMHKNKLQY FMEQFYGIZR NVDSNHKBLS 360 



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lAIRGYGLFA 6PCKVINAKD VDFMYVELIQ RCKQMFLTQT DTGDDRVYQN PSFLQSVASV 420 

IJ.YLDTVPEV YTPVIiEHLW NQZOSFPOYS PXHQLVC!CRA ZVKVFLAIAA RGFVLSNCZS 480 

TWHQGLIRI CSXPWLPKO PESBSBDRRA SGBVRTGKWK VPTYKXTWDL FRHLLSSDQM 540 

MDSILASEAF PSVNSSSESL KHLLYDEFVX SVLKIVEKLD LTLBIQTVGB QEHGDEAP6V 600 

WMIPTSDPAA NLHPAKPKDF SAFINLVEFC REILPEKQAE FFEPHVYSFS YELILQSTRL 660 

PLISGFYKLL SITVRMAKKI KYFEOVSPICS LRH8FEDPBK YSCPALFVKP 6KEVAVKMKQ 720 

YKDEIiXASCl. TFXiliSLPBKI ZELDVBAYVP ALQMAFKLGL SYTPLAEVOL NALEEWSIYZ 780

DRSVMQPYYK DZLPCLDGYL ICTSALSDETK NNHBVSALSR AAQKGFNKW LKHLKKTKNL 840 

SSNBAISLEE IRIRWQMLG GL66QINKNL LTVTSSDEMM KSYVAWDREK RLSFAVPFRE 900 

MKPVIFLDVP LPRVTELALT ASDRQTKVAA CBLLHSMVMF MLGKATQMPE GGQQAPPMYQ 960 

LYXRTFPVLL RIiACDVZ>QVT RQLYEPZtVMQ LIHWPTHDIXX PB3QDTVALL EAILDOIVDP 1020 

VDSTLRDPC6 RCZREFIiKHS ZXQZTFQQQB KSFVMTRSLF KRLYSLALHP NAFKRLGASL 1080 

AEMNIYREFR EEESLVEQFV PBALVZYNBS lALAHADEKS LGTIQQGCDA IDHLCRIZEK 1140 

KHV8U7KAKX RRLPRGFPPS ASLCLLDLVK WLLAHOGRPQ TECRHKSZEL FYKFVPLLPG 1200 

NRSPNIiWLKD VLKEEGVSFL ZNTFBGQGCG QPSGllAQPT LLYLRGPPSL QATLCWLDIiL 1260 

UUVLECYNTF XOERTVGALQ VL6TEAQSSL LKAVAFFLES XAHHDZIAAE KCFGTGAAGN 1320 

RTSPQE6ERY NYSKCTVWR ZMBFTTTLIiN TSPEGWKLLK KDLCNTHLMR VLVQTLCEPA 1380 

SZOFNZGDVQ VMAHLPDVCV KZMKALXMSP YKDXLETHLR EKITAQSIEE LCAVNLYGPD 1440 

AQVDRSRLAA WSACRQLHR AGIiLHNZIiPS QSTDLHHSVG TELLSLVYKG lAPGDERQCIi 1500 

PSLDLSCKQL ASGLLEIAFA FGGLCERLVS LLLNPAVLST ASLGSSQGSV IKFSHQSYPy 1560 

SLFSETIMTE LLKNLDLAVL HLMQSSVDNT KMVSAVLNGM LDQSFRERAN QKUQGLXItAT 1620 

TZLQHWKKO) SWWAKDSPIiE TKMAVIALLA KILQIDSSVS PNTSHGSFPE VFTTYZSLLA 1680 

OTKLDLHLKG QAVTLLPFFT SIjTGGSLEKL RRVLEQLIVA HFPMQSREFP PGTPRFNNYV 1740 

DCMKKFLDAL BLSQSPMLLB U4TEVLCREQ QHVMBEIiFQS SFRRZARRGS CVTQVGLLBS 1800 

VYEMFRKDDP RL8FTRQSFV DRSLLTLLWH CSLDALREFF STZWDAZDV LKSRFTK12IE 1860

STPDTQITKK MGYYKILDVM YSRLPKDDVH AKBSKZNQVF HQSCZTBGNB LTKTLZKLCY 1920 

DAPTENMAGE NQIiLERRRLY HCAAYNCAIS VlCCVFNBIiK FYQGFLPSEK PEKNLLZFEN 1980 

IjIDIiKRRYNP PVEVBVPMER KKKYZEZRKE AREAANGDSD GPSYMSSLSY LADSTLSBEM 2040 

SQFDFSTGVQ SYSYSSQDPR PATGRFRRRE QRDFTVBDDV ZiELEMDEZ«NR HECKAPLTAL 2100 

VKHMHRSLGP PQGEEDSVPR DLPSWMKFLH GKLGIIPZVPL NIRLFLAKLV ZHTEBVFRPY 2160 

AKHWLSPLLQ lAASBNNQGE GZHYMWEXV ATZLSHTGLA TPTGVPKDEV LANRIiLKPIrf* 2220 

KHVFHPKRAV PRHNIiEZZKT LVECKKDCLS IPYRLZFEKF SGKDPNSKDN SVGZQLLGZV 2280 

MANDLPPYDP QC3GZQSSEYF QALVNNMSFV RYKEVYAAAA EVLGLZLRYV MERKNXLEES 2340 

LCELVARQIiK QEQNTMEDKF ZVdiNlCVTKS FPPLAORFHN AVFFLLPKFH GVUCTLCIiBV 2400 

VLCRVBGHTB LYFQLKSKDF VQVMRHRDDB RQKVCLDIIY KMHPKLKPVE LRELLNFWB 2460 

FVSHPSTTCR EQMYNXLMWZ BDNYROPBSB TDND8QEZFK LAKDVLZQGL ZDBHPGLQLX 2520 

ZRNFWSHETR LPSNTLDRHi ALKSLYSPRZ EVHFIjSLATN FLLEMTSMSP DYPNPMFEHP 2580 

LSECEFQEYT ZDSDWRFRST VLTPMFVETQ ASQQTLQTRT QEQSLSARWP VAGQZRATQQ 2640 

QHDFTLTQTA DGR8SFDWLT GSSTDPIiVDH TSPSSDSLLF AHKRSERLQR APXiKSVGPDF 2700 

GKKRLGLPQ} EVONKVXQAA GRTDLUtZiRR RFMROQSKIiS LlflURKGVAB QKRBKBZXSE 2760 

LRMKQDAQW IiYRSYRHGDL PDIQZKRSSL ZTPLQAVAQR 0PXZAKQLF8 SLFSGZLKEM 2820 

DKPKTLSEKN NITQKLLQDF NRFLNTTFSP FPPPVSCZQD ZSCQHAALLS LDPAAVSAGC 2880 

IiASLQQPVGI RUjEBALLRIi LPAELPAKRV RGKARLPPDV LRWVBLAKLY RSIGEYDVLR 2940 

GZFTSBZGTK QXTQSALLAE ARSDYSEAAK QYDEALNKQD WVDGEPTEAE KDFWELASLD 3000 

CmHZiAEHKS LEYCSTASZD SBEIPFDXdiKZ WSEPFYQETY LFYNZRSKLK IiLLQQBADQS 3060 

LLTFZDKAMH GELQKAZLEL BYSQELSUiY LLQDDVDRAK YYIQN6ZQSF MQNYSSZDVXi 3120 

LHQSRLTKLQ SVQALTEIQE FZSFZSKQQf LSSQVPLKRL LNTWTNRYPD AKMDPMNZWD 3180 

DIZTNRCFFL SKZEBKLTPL PEDNSMNVDg DGDPSDSMBV QEQBEDXSSL IRSCKPSMKM 3240 

KMZDSARKQK NFSLAMfCLZ«R ELHKESKTRD DKLVSWVQSY CRItSKCRSRS QGCSEQVLTV 3300 

LKTVSLLDEKT KVSSYLSRNZ liAFRDONZZJi GTTYRZZANA LSSBPACLAB ZEEDKARRZIi 3360 

ELSGSSSEDS BKVIAGLYQR AFQHLSEAVQ AAEBEAQPPS HSG8PAAGVI DAYMTLADFC 3420 

DQQLRKEEBN A8VIDSAELQ AYPALWEKM LKALKUTSNE ARLKFPRLIiQ IZERYPEETL 3480 

SLMTKBISSV PCWQPISWIS HMVALLDKDQ AVAVQHSVEE ITDNYPQAIV YPPllSSBSY 3540

SFKDTSTGHK NKBFVARZKS KLDQGGVZQD FIHAUIQLSlf PELI«FKDHSN DVRAEIiAKTP 3600 

VHKKNZBXNY BRHYAAL(33P KAFGL6AFRR XFZQTFQCEF DKBFGKGGSR LLRMKLSDFN 3660 

DZlUMXiLLKM NKDSKPPCSIL KECSFHNSDF KVEFIiRNELB ZP6QYDGRGR PlfEYHVRlA 3720 

GFDSRVTVMA SLRRPKRXZZ RGHDEREHPF LVKGGEDLRQ DQRVEQLFQV MNGII»AQDSA 3780 

CSQRALQLRT ySWPMTSRL GLZEWIiENTV TLKDLLLNTM SQEEKAAYLS DPHAPPCEYK 3840 

DWLTRM8GKH DVGAYMLMYK GANRTETVTS FRKRBSKVPA DLLKRAFVRN STSPEAPLAL 3900 

RSHFASSBAL ZCZSHWZL6Z GDRHLMNFMV ANBTG6VZGZ DFGHAFGSAT QFLPVPBLMP 3960 

FRZiTRQFINli WtPHXBTQLN YSIUVHAZiRA FRSDPGLLIN 7MDVFVXEPS FDHKNFBQKM 4020 

UEKGGSWZQS ZKVAEKHHYP RQKZCYAKRK lAQANFAVXT CDBLLLGHBK APAEKDYVAV 4080 
ARSSKDBNZR AQBPES6LSE BXQVKCLMDQ ATDFNILQRT HBGWBPHM 

Seq ID NO: 100 DNA sequence
Nucleic Acid Accession #: NM_000673
Coding sequence: 101-1225

1 11 21 31 41 51

I I I I I I

ATGTGAAG6C ACAA6CTGCT GTTATATACA ACAGAGTGAA CTGAGCATCA GTCAGAAAAA 60 

GTCTATGT7T GCAGAAATAC AGATCCAAGA CAAAGACAGG ATGGGCACTG CTGGAAAAOT 120 

TATTAAATGC AAAGCAGCTG TGCTTTGGGA GCAQAAGCAA COCTTCTGCA TTGAGOAAAT 180 

AGAAGTTGCC CCACCAAAGA CTAAAGAAGT T06CATTAAG ATTTTGGCCA CAGGAATCTG 240 

TCX3CACA6AT GACCATGTGA TAAAAGGAAC 7ATGQTGTCC AAGTTTCCAQ TGATTGTGGG 300 

ACATGA6GCA ACTGGGATTG TAGAGAGCAT TGGAGAAGGA GTGACTACAG TGAAACCAGG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAGAQAA TGCAATGCTT 6T0GCAACCC 420 

AGATGGCAAC CTTTGCATTA GGAGCQATAT TACTGQTOGT G6AGTACTGG CTGATG6CAC 480

CACCAGATTT ACATGCAAGO 6CAAACCAGT ACACCACTTC ATGAACACCA 6TACATTTAC 540 

OGAGTACACA QTGGTGGATG AATCTTCTGT TGCTAAGATT GATGATGCAG CTCCTCCTGA 600 

GAAAGTCTGT TTAATTGGCT GTGGGTTTTC CACTGGATAT GGCGCTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGOGTCGT CTTTGGCCTG GGAGGA6TTG GCCTGTCRGT 720 

CATCATGGGC TGTAAGTCAG CIGGTGCATC TAG6ATCATT GGQATT6ACC TCAACAAAGA 780

CAAATTTGAG AAGGCCATGG CTGTA6GTGC CACTQAGTGT ATCAGTCCCSl AGGACTCTAC 840 

CAAACCCATC AGTOAGGTGC TGTCAGAAAT GAC2AGGCAAC AACX3TGGGAT ACACCTTTGA 900 

AGTTATT6GG CATCTTGAAA CCATGATTGA TGCTCTGGCA TCCTGCCACA TQAACTATGG 960 

GACCAGCGTG GTTGTAGGAG 7TCCTCCATC AGCCAAGATG CTCACCTATG ACCGGATGTT 1020 
GCTCTTCACT G0ACX3CACAT GGAAGGGATG TGTCTTTGGA GGTTTGAAAA GCA6AGATGA 1080
TGTCOCAAAA CTASTGACT6 AGTTCCTGGC AAAGAAATTT GAOCTGGAOC AGTTGATAAC 1140
TCATGTTTTA CCATTTAAAA AAATCAGTGA AG6ATTTGAG CT6CTCAATT CAG6ACAAAG 1200
CATTCX3AACG GTCCTGACGT TTTGAGATCC AAAGTGQCAG GAGGTCTGTG TTGTCATGGT 1260
GAACTGGAGT TTCTCTTGTG AGAGTTCCCT CATCTGAAAT C3^TGTATCTG TCTCACAAAT 1320
ACAAGCATAA 6TAGAAGATT TGTTGAA6AC ATAGAACCCT TATAAA6AAT TATTAACCTT 1380
TATAAACATT TAAAGTCTTG TQAGCACCTG GGAATTAOTA TAATAACAAT GTTAATATTT 1440
TTGATTTACA TTTTGTAAG6 CTATAATTGT ATCTTTTAAG AAAACATACA CTTGGATTTC 1500
TATGTTGAAA TGGAGATTTT TAA6AGTTTT AACCAGCTGC TGCAGATATA TAACTCAAAA 1560
CAGATATAGC GTATAAAQAT ATAGTAAAT6 CATCTCCCW3 AGTAATATTC ACTTAACACA 1620
TTGAAACTAT TATTTTTTAG ATTTGAATAT AAAT6TATTT TTTAAACACT TGTTATGAGT 1680
TAACTTGGAT ^CATTTTGA AATCAGTTCA TTCCATGATG CATATTACXO ^TTAGATTA 1740
AGAAAGACAG AAAA6ATTAA 6GGACGGGCA CATTTTTCAA CGATTAA6AA TCATCATTAC 1800
ATAACTTGGT GAAACTGAAA AAGTATATCA TATGGGTACa CAAGGCTATT TGCCAGCATA 1860
TATTAATATT TTAGAAAATA TTCCTTTTGT AATACTGAAT ATAAACATAG AGCTAGAGTC 1920
ATATTATCAT ACTTATOVTA ATGTTCAATT T6ATACAGTA OAATTGCAA6 TCCCTAAGTC 1980
OCTATTCACT GTGCTTAGTA GTGACTCCAT TTAATAAAAA GTGTTTTTAG 'rm'TAACAA 2040
CTAAACO&

Seq ID NO: 101 Protein sequence:
Protein Accession #: NP_000664

1 11 21 31 41 51

I I I I I I

MGTAGKVIKC KAAVLWEQKQ PPSIEEIEVA PPKTKEVRIK ILATGICRTD DHVIKQTMVS 60
KFPVIVGHEA TGIVESIGEG VTTVKPGDKV IPLPLPQCRE CNACRNPDGIT LCIRSDITGR 120
GVLADGTTRP TOKGKPVHHF MNTSTFTEYT WDESSVAKI DDAAPPEKVC liIGCGPSTGY 180
QAAVKTGKVK PQSTCWFGL GGVGLSVIKG CK8AGASRII GXDUnCDKFB KAMAVGATBC 240
ISPKDSTKPI SEVLSEMTGN NVGYTFEVriG BLETMIDALA SCHMNYGTSV WGVPPSAXM 300
LTYDPMLLFT GRTWKGCVFG GLKSRODVPK LVTEFIiAKKF DLDQXiITHVL PPKKZSBGFB 360
LLNSGQSIRT VLTP

Seq ID NO: 102 DNA sequence
Nucleic Acid Accession #: NM_006783.1
Coding sequence: 1-786

1 11 21 31 41 51

I I I I I I

ATGGATTGGG GGAGGCT6CA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 60
GGGAAGGtGT OGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 120
CAQGAA6T6T GGGGTGACGA GCAAGAGGAC TTCX3TCT6CA ACACACTGCA AC06GGATGC 180
AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCXIGCTGTG GGCCCTCCAG 240
CTGATCTTCG TCTCCACCCC AGOGCTGCTG GT6GCCATGC ATGTGGCCTA CTACAGGCAC 300
GAAAGCACrC GCAAGTTCAG GGGAGGAGA6 AAGAGGAATG ATTTCAAAGA CATAGAG6AC 360
ATTAAAAAGC ACAAGGTTCG GATAGft GGGG TGGCTGIG6T GGACGTACAC C3kGCAGCATC 420
TTTTTCCX3AA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAAT0G6 480
TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 540
TTTATTTCTA GGCCAACAGA GAAGACOSTG TTTACCATTT TTATGATTTC TGCGTCT6TG 600
ATTTGCATGC TGCTTAOGT 6GCA6A6TTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660
AGATCAAAlQA GAgCS^CAGAC GCAAAAAAAT CACCCCAATC AT6CCCTAAA GGAOAGTAAG 720
CAGAATGAAA TGAATGA6CT GATTTCAGAT AGTGGICAAA ATGCAATCAC A6GTTTCX3CA 780
AGCTAA



Seq ID NO: 103 Protein sequence: 
Protein Accession #: NP_006774.1

1 11 21 31 41 51

I I I I I I

MDWGTLHTFI GGVNKHSTSI GKVHITVIFI PRVMILWAA QHVKGDEQSD FVCNTLQPGC 60
KNVCYDHFFP VSHIRLWALQ LIFVSTPALL VAMHVAYYRH BTTRKFRRGE KRNDFKDIED 120
IKKHKVRIEG SLWWTYTSSI PPRIIFEAAP MYVPYFLYNG YHI^WVLKCG IDPCPNLVDC 180
FISRPTEKTV FTIFMISASV ICMItLKVAEL CmiLKVCFR RSKRAQTQKN HPtHZALKESK 240
QNENHELISD SGONAITGFP S

Seq ID NO: 104 DNA sequence
Nucleic Acid Accession #: NM_020411
Coding sequence: 86-526

1 11 21 31 41 51

I I I I I I

GGACCTGGGA AGGAGCATAG 6ACAGGGCAA GGCGGGATAA GGAGGGGCAC CACAGCCCTT 60
AAGGCA06AG GGAACCTCAC TGC6CATGCT CCTTTGGTGC CCACCTCAGT GCGCATGTTC 120
ACTG^OGTC TTCCCA'fCGG CCGCTTOQCC AGTQT6GGGA AOGCGGCGQA GCTGTGAQCC 180
GGCGACTOGG GTCCCTGA6G TCTGGATTCT TTCTCOGCTA CTGAGACAC6 G06GACACAC 240
ACAAACACAG AACCACACAO CCAGTCCCAG GAGCCCAGTA ATGGAGAGCC CCAAAAAGAA 300
GAACCAGCAG CTGAAAGTGG GGATCCTACA CCTGGGCAGC AGACAGAAGA AGATCAGGAT 360
ACAGCTGAGA TCCCAGTGCG CGACATG6AA GGTOATCIGC AAGAQCTGCA TCA6TCAAAC 420
ACOGGGQATA AATCT66ATT TGGGTT00G6 CGTCAAGGTG AAGATAATAC CTAAAGAGGA 480
ACACTGTAAA ATG0CA6AAG CAOGTGAAOA 6CAACCACAA GTTTAAATGA AGACAAGCTG 540
AAACAACGCA AGCTGGTTTT ATATTAGATA TTTGACTTAA ACTATCTCAA TAAAGTTTTG 600
CAGCTTTCAC CAAAAAAAAA


Seq ID NO: 105 Protein sequence: 
Protein Accession #: NP_065144.1

1 11 21 31 41 51

I I I I I I

MLIiWCPFQCA CSLGVFPSAP SPVHGTRRSC BPATRVPEVW ILSPUiRHGG HTQTQNHTAS 
PSfiPVMESPK KXNQQLKVGI LHLGSRQKKX RIQLRSQCAT WKVICSCSCZS QTPGI27IiDL6 

SGVKVKIIPK EEHCKMPEAG EBQPQV 

Seq ID NO: 106 DNA sequence
Nucleic Acid Accession #: J04129
Coding sequence: 99-587

1 11 21 31 41 51

I I I I I I

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 60
TCACCCltaGG CGTGGCCCTG GTCTGTGGTQ TCCCGGCCAT GGACATCCCC CAGACCAAGC 120
AGGACCTGGA GCTCCCAAAG TT6GCA8GGA CCTGGCACTC CATGGCCATG GCGACCAACA 180
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTQAGGOT CCACATCACC TCACTGTT6C 240
CCACCCCCGA GGACAACCTG GAQATCGTTC TGCACAGATG GGA6AACAAC A6CT6TGTT6 300
AGAAGAAG6T CCTTGQAGAG AAGACTGG6A ATCCAAAGAA GTTCAAGATC AACTATAGGG 360
TGGCGAAC6A G6CCACGCTG CTCGATACTG ACXACGACAA TTTCCTGTTT CTCTGCCTAC 420
AGGACACGAC CACCCCCATC CAGA6CAT6A TGTGCCAGTA CCTGGCCAGA 0TCCTG6TGG 480
AGGACQATGA GATCATGCAG G6ATTCATCA GGGCTTTCAG 6CCCCT6C0C AOGCACCTAT 540
GGTACTT6CT GGACTTQAAA CAGAT06AAG AGC0STQCC6 TTTCTAGCTC ACCTCCGCCT 600
CCAGGAAGAC CAGACTCCCA CCCTTCX31CA GCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660
TTTCAAAGAA TAACCACAGC TCAGAAGACG ATGAOGTGGT CATCTGTGTC GCCATCCCCT 720
TCCTGCTGCA CACCT6CACC ATTGCCATGG GGAQGCTGCT CCCTGGG66C AGAGTCTCT6 780
GCAGAOCSTTA TTAATAAACC CTTGGA6CAT 6

Seq ID NO: 107 Protein sequence:
Protein Accession #: AAA60147

1 11 21 31 41 51

I I I I I I

MDIPQTKQDIj ELPKLAGTWH SMAMATNNIS LMATIiKAPLR VHITSLLPTP EDNLEIVIiHR 60
WENNSCVEKK VUSBKTGNPK KPICINYTVAN EATIiLDTDYD NFLPLCLQDT TTFIQSMMOQ 120
YLARVLVEDD BIMQQFIRAF RFLFRHLWYL IiDLXQHEBPC RF



Seq ID NO: 108 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 48-794

1 11 21 31 41 51

I I I I I I

TCCCAGGCA6 CAGTTAGCCC QCCGCCCGCC TGTGTGTCCC CAGAGCCATG GAGA6AGCCA 60
GTCTGATCCA GAAGGCCAA6 CTGGCAGAGC AOGCCGAAC6 CTATGAGGAC ATGGCAGCCT 120
TCATQAAAGG 03CG8T06AG AAGOGGGAaO AGCTCTCCTG CGAA6A8C6A AACCTGCTCT 180
CAGTA6CCTA TAAGAACCTG GTGGGGGGCC AGAGGGCTGC CT6GAQ0GTG CTOTCCAGTA 240
TTGAGCAGAA AAGCAACGAG GAGGGCTCGG AGGAGAAGG6 6CCC6AGGTG OGTGAGTACC 300
GGGAGAAGGT GGAGACTGAG CTCCAGGGCG TGTGOGACAC CGTGCTGGGC CTGCTGGACA 360
GCCACCTCAT CAAGGAGGCC GG6QACGCCG AGAQCCGGGT CTTCTACCTG AAGATGAAGG 420
6TGACTACTA CGGCTACCTG 6CC6A6GTGG CCACGGGTGA CGACAAGAAG CGCATCATTG 480
ACTCAGCCC6 GTCAGCCTAC CAQGAGGCCA TGGACATCAG CAAGAAGGAG ATGCOGCCCA 540
CCAACCCCAT COGCCTGGGC CTOGCCCTGA ACTTTTCCGT CTTCCACTAC GAGATC6CCA 600
ACAGCCCCGA GGAGGCCATC TCTCTGQCCA AGACCACTTT CGAOGAGGCC ATGGCTQATC 660
TGCACACCCT CA6CGAGQAC TCCTACAAAG ACftGCACCCT CATCATGCAG CTGCTGCGAG 720
ACAACCTGAC ACTGTGGA09 GCCGACAACG CCGGGGAAGA GGGGGGCGAG GCTCCCCAQG 780
AGCCCCAGAG CTGA6TGTT0 CCCGCCACCG CCCCGCCCTG CCCCCTCCAG TCCCCCACCC 840
TGCOGAGAGG ACTAGTATGG GGTGGGAGGC CCCACCCTTC TCCCCTAGGC GCTGTTCTTG 900
CTCCAAAGGG CTCCGTGGAG AGGGACTGGC AGAGCTGA6G CCACCT6GG6 CTGt3Q(3ATOC 960
CACTCTTCTT GCAGCTOTTG AGCGCACCTA ACCACTGGTC ATGCCCCCAC CCCTGCrCTC 1020
CQCAOCCGCT TCCTCCOSAC CCCAGGACCA GGCTACTTCT CCCCTCCTCT TGCCTCCCTC 1080
CTGCGCCTGC TGCCTCTGAT 06TAGGAATT GAGGAOTGTC CC6CCTTGTG GCTGAGAACT 1140
GQACAGTGGC AGGGGCTGGA GATGGGTGTG TGTGTOTGTG TGTGTGTGTQ TGTGTGTGTQ 1200
OGCGCGOGCC A6TGCAAGAC OQAGATTQAa GQAAAfiCATG TCTGCTGGGT GTGACCATGT 1260
TTCCTCTCAA TAAAGTTCCC CTQTGACACT C

Seq ID NO: 109 Protein sequence: 
Protein Accession #: NP_006133.1 



1 11 21 31 41 51 

I I I I I I

MERASLIQKA KLAEQAERYE DMAAFMKGAV BKGEELSCEB RNLLSVAYKN WQGQRAAWR 60
VliSSIEQKSH BEGSBEKGPE VREYREKVET SIiQGVCDTVL GLLDSHLIKE A6DABSRVFY 120
LKHKGDYYRy LAEVATODDK KRIIDSARSA YQBAMDI8KK EMPPTNPIRL GLAXUFSVFH 180
YSIANSPEEA ISLAKTTFDE AMADLRTLSE DSYKDSTLIM QLLRDNLTUf TAZ3NAGEB6G 240
BAPQBPQS

Seq ID NO: 110 DNA sequence 
Nucleic Acid Accession #: NM_000695
Coding sequence: 407-1564

1 11 21 31 41 51

I I I I I I

CAOQAGTTGG TTTG6GA6CT GCCAGTCTCC TG66AGGATC GCAGTCAGCA GAGCAGGGCT 60
GAG6CCTGGG GGTAGGAGCA GA60CTGC6C ATCTGGAGGC AGCAT6TCCA AGAAAOGGAG 120
TGGAGGTGCA GCGAAGGACC CAGGGGCAGA GCCCACGCTG GGGATGGACC CCTTC6AGGA 180
CACACTGCGG OGGCTGCGTG AGGCCTTCAA CTGAGGGCGC AOGCGGCCGG CCGAGTTCCG 240
GGCTGCGCAG CTCCAGGGCC TGGGCCACTT CCTTCAAGAA AACAAGCAGC TTCTGOGOGA 300

CGTGCTGGCC CAflGACCTGC ATAA.GCCA6C 
TT6CCAGAAC GAGGTTQACT ACX3CTCTCAA 
A06GTCCACX3 AACCTGTTCA TGAAGCT6GA 
CCTGGTCCTC ATCATCGCAC CCTGGAACTA 
GQGCACCCTC CCCGCNSGOf^ ATTGCGTGGT 
AGAGAAGGTC CTGGCTGAGG TGCTGCCCCA 
6CTGQQCGGA GCGCAGQAQA CAG60CAGCT 
CACAGGGAGC CCTOGTGTGG GCAAGATTGT 
TGTCACCCTG GAGCTGGGGG 6CAAGAACCC 
6ACCGTGGCC AACCGCGTGG CCTGGTTCTG 
CCCTGACTAC 6TCCTGTGCA GCGCCGAGAT 
CACCATCACC CGTTTCTATG GCGAOGACCX 
CAACCAGAAA CAGTTCCAGC GGCTGCGGGC 
GGGCCAGAGC AACGAGA6CG ATCGCTACAT 
GACGGAGCCT GTGATGCAGG AGGAGATCTT 
GAGGGTGGAC GAGGCCATCA AGTTCATCAA 
CTTCTCCAAC AGCACSACAGG TTGTGAA(XA 
TGGAOGCAAT GAGGGCTTCA CCTACATATC 
CCACAGTGGG ATGGGCOGGT ACCAOGGCAA 
CACCTGCCT6 CTCGCCCCCT CCGGCCTGGA 
TACCGACTGG AACCAGCAGC TGTTACGCTG 
GTGAGCX3TCC CACCOGCCTC CftAOGQGTCA 
GCTTAT6CTC CCAACTCACA TTOTTCCTCC 
TGGAGCTGTC ACATGACTGC ATCCT6CCTG 
TCTGGGGGAC GCTGCTC!GAG AGAGGCXTGAG 
CACCCCACCC TCCCCAATTC CAGCCCTTTG 
CACA6GGGCA GTGTCACCCT GGAAAATACA 
GAACGGTTGA GAGCGTGGAG CCCTCCAGGC 
TTCXACCTCT GCCCCATCCC AACTGCACCA 
OCGACACTGG TCTCTGCACC ACCCCTCTGG 
A6CTCCATCC ACTGGGAAAA CTGGGGTTT6 
CTGGGGGCAA GTCCCTTGAC TTCTCTGAGC 
CCAAAATGGA GTCACTTATG CCAAACTCTA 
CCCTCAOVCA CACATGCCCQ TAACAG6ATT 
AGACACAGG6 OGTATGGAAA AGCACGTCCT 
OATGCTTACC TACCAOGOCC OTCTCCACCA 
TGTQACTTAC AAACCTTGTT TAAAAGCTOC 
CCXrrTGGCTG TGGCCCTCTQ TGTATGCCTG 
GGAATCCTCT GCTCCTCCCA AATAAATTCA 



Seq ID NO: 111 Protein sequence:
Protein Accession #: NP_000686



TTTCGAGGCA 
GAACCTTCA6 
CTOGOTCTTC 
CCCATTGAAC 
GCTGAAGC06 
GTACCTGGAC 
GCTAGAGCAC 
CATQACTGCT 
CTGCTAOGTG 
CTACTTCAAT 
GCAGGAGAGG 
CCAGAGCTCC 
ArrGCTGGGC 
CGCCCCCAOG 
CX3GGCCCATC 
CCG6CAGGAG 
GATGCTGGAO 
TCTGCTGTCC 
GTTCACCTTC 
GAAATTAAA6 
GGGCATGGGC 
CACAGAGAAA 
AGACOGCAGG 
CCAGGGCTGC 
AGGCCGCAGA 
CCCTCT06GT 
GTGCCCTGCC 




GACATATCTG 
GCCTGGATGA 
ATCTGGAAGG 
CTGACCCTGG 
TCAGAAATCA 
CAGAGCTGCT 
AAGTTGGACT 
GCCACCAAGC 
GAC6ACAACT 
GCCGGCCAGA 
CTGCTGCCC38 
CCAAACCTG6 
TG0G6CCGCG 
GTGCTGGTGG 
CTGCCCATCG 
AAGCCCCTGG 
06GACCA6CA 
GTGCCATTOG 
GACACCTTCT 
GAGATCCGCT 
TCCCAGAGCr 
CCTGAGTCTA 
CTCCCCCAGC 
AAA6CAAGGT 
ACATGCCAGG 
CAGG6TTGGC 
TTCTTAGGGG 
CCCTCTAGGC 
CCCCAGGGAT 
ACCCTGCACT 
CTGCACAGTG 
TTATGTGAAA 
GTCGGGGGGG 
ACACGCCTGC 
AGTATTCCAG 
GCXSkACTCCT 
TTCTGTCCTT 
AAGCACTCAT 






AGCTCATCCT 
AGGATGAACC 
AACCXOTTGG 
TGCTCCTGGT 
GCCAGGGGAC 
TTGCCGT6GT 
ACATCrrCTT 
ACCTGACGCC 
GCGACCCCCA 
CCTGCGTGGC 
CCCTGCAGAG 
GCC6CATCAT 
TGGCCATTGQ 
ACGTGCAGGA 
TGAACGTGCA 
CCCTGTAC6C 
GCQQCA6CTT 
GGG6AGTC6G 
CCCACCACCG 
ACCCACCCTA 
GCACCCTCCT 
60CATGAG6G 
CTCA6GTT6C 
CTTGCTTCTA 
TGTCCTCACT 
CAGGCCCAGT 
CATCAGCCCT 
ACACQOQCAC 
CCTCTCACAT 

cacccacagc 
ttagtggg;^ 

GTTGCTGGAA 
CACATAGAAG 
ATGTAAGACC 
ATGAGCTGCA 
GCGATCAGCT 
TAAAAOGTTC 
AGCCCAOATA 



MKDEPRSTNL 
ISQGTKKVLA 
KHLTPVTLEL 
PALQSTITRP 
VDVQETEPVM 



RypPYTDHHQ 



11 

I 

PMKLDSVFIW 
EVLPQYLDQS 
GGKNPCYVDD 
YGDDPQSSEN 
QSBIPGPZLP 
FTYZSLIiSVP 
QLXiRHGMGSQ 



21 

I 

KEPFGLVLII 
CFAWLGGPQ 
NCDPQTVANR 
LGRZmQXQP 
IVNVQSVDEA 
P6GVGES04G 
8CTU. 



31 
I 

APimYPLNLT 
ETGQLLEHKL 
VAWFCYPNAG 
QRIiRAUJCCG 
IKFIBROBKP 
RYIK3KPTFDT 



41 



51 
I 



LVLLVGTLPA GNCWLKPSE 
DYIFFTGSPR VGKIVMTAAT 
QTCVAPDYVIt CSPEMQERLli 
RVAZG6QSHB SDRYIAPTVL 
IiALYAPSirSR QWNQMLBRT 
FSRHRTCLLA PSGLEKLKEI 



Seq ID NO: 112 DNA sequence 
Nucleic Acid Accession #: NM_004456
Coding sequence: 56-2298

1 11 21 31 41 51

I I I I I I

GAATTCGGGG CGACGCGCGG GAACAAOGCG AGTCGGCGCG CGGGACGAA6 AATAATCATG 60
6GCCA6ACT6 GGAAGAAATC TGAGAAGG6A CCAGTTTGTT GGC6GAA6GG T6TAAAATCA 120
GABTACATGC GACIGAGACA GCTCftAGAGG TTCAGAOGAG CTGATGAAGT AAAOAGTATG 180
TTTAGTTCCA ATCGTCAGAA AATTTT6GAA AGAACGGAAA TCTTAAACCA A6AATGGAAA 240
CAGCGAAGGA TACAGCCTGT GCACATCCTG ACTTCTGTGA GCTCATTGCJQ GGGGACTAGG 300
GAGTGTTCGG TGACCAGTGA CTTGGATTTT CCAACACAAG TCATCCCATT AAAGACTCTG 360
AATGCAGTTG CTTCAGTACC CATAATGTAT TCTTGGTCTC CCCTACAGCA GAATTTTAT6 420
GTGGAAGATG AAACTGTTTT ACATAACATT CCTTATATGG GAGATGAAOT TTTAGATCA6 480
6ATGGTACTT TCATT6AAGA ACTAATAAAA AATTATGATG GGAAAGTACA CGGGGATAGA 540
GAATGTGGGT TTATAAATGA TGAAATTTTT GTGGAGTTGG TGAATGCCCT TGGTCAATAT 600
AATGATGATG ACGATGATGA TGATGGAGAC GATCCTGAAG AAAGA6AAGA AAAGCAGAAA 660
GATCTGGAGG ATCACCX3AGA TOATAAAGAA A6CCGCCCAC CTOGGAAATT TCCTTCTGAT 720
AAAATTTTGG AGGCCATTTC CTCAATQTTT OCAGATAAGG GCACAGCAGA AGAACTAAA6 780
GAAAAATATA AAGAACTCAC COAACAOGAG CTCCCAG606 CACTTCCTCC TGAATOTACC 840
CCCAACATAG ATGGACCAAA TGCTAAATCT GTTCAGAGAG AGCAAAGCTT ACACTCCTTT 900
CATACGCTTT TCTGTAGGCG ATGTTTTAAA TATGACTGCT TCCTACATCC TTTTCATGCA 960
ACACCCAACA CTTATAAGOG GAAGAACACA GAAACAGCTC TAGACAACAA ACCTT6TGGA 1020
CCACAGTGTT ACCAGCATTT GGAGGGAGCA AAGGAGTTTG CTGCTGCTCT CACGGCTGAG 1080
CGGATAAAGA CCCCACCAAA ACGTGCAGGA GGCCGCAGAA GAGGAOGGCT TCCCAATAAC 1140
A6TAGCAGGC CCA6CACCCC CACCATTAAT GTGCTGGAAT CAAAGGATAC AGACAGTGAT 1200
AGGGAAGC3U3 GGACTGAAAC GGGGGGAGAG AACAATGATA AAGAAGAAGA AGAGAAGAAA 1260
GATGAAACTT CGAGCTCCTC TGAAGCAAAT TCTGGGTGTC AAACACCAAT AAAGATGAAG 1320
CCAAATATTG AACCTCCTGA GAATGTGGAG TGGAGTGGT6 CTGAAGCCTC AATGTTTAGA 1380
GTCCTCATTG GCACTTACTA TQACAATTTC TGTGCCATTG CTAGGTTAAT TGGGACCAAA 1440
ACATGTAGAC AGGTGTATGA GTTTAGAGTC AAAGAATCTA GCATCATAGC TCCAGCTCCC 1500
GCTGA6GATG T66ATACTCC TCCAAGGAAA AAGAAGAGGA AACAGC6GTT GTGGGCTGCA 1560
CACTGCAGAA AQATACAGCT GAAAAAGGAC GGCTCCTCTA AGCATGTTTA CAACTATCAA 1620
CCCTGTGATC ATCCACGGCA GCCTTOTGAC AGTTCQTGCC CTTGTGTGAT AGCACAAAAT 1680
TTTT6TGAAA AGTTTTOTCA ATdTAOTTCA GAOTOTCAAA ACCQCTTTGC GGGATGCOGC 1740
TGCAAAGOVC AGT6CAACAC CAA6CA0TGC CCGPrGCTACC TG6CTGTC06 AGAGTGTGAC 1800
CCTGACCrCT GTCTTACTTG TGGAGCCGCT QACCATT6GG ACA6TAAAAA TGTGTCCTGC 1860
AAGAACTGCA OTATTCAGCG GGQCTCCAAA AAGCATCTAT TGCTGGCACC ATCTGAOGTG 1920
6CAG6CIG<3G GGATTTTTAT CAAAGATCCT GTGCAGAAAA ATGAATTCAT CTCAGAATAC 1980
TGTGGAQAQA TTATTTCTCA AOATGMGCT 6ACA6AAGA0 GGAAAGTGTA TGATAAATAC 2040
ATGTGCA6CT TTCTGTTCAA GTTGAACAAT 6ATTTTGTGG TGGATGGAAC CCGCAA6G6T 2100
AACAAAATTC GTTTTGCARA TCATTCGGTA AATCCAAACT GCTATGCAAA AffTTATGATG 2160
GTTAACGGTG ATCACAGGAT AGGTATTTTT GCCAAGAGAG CCATCC7VGAC TGGCGAA6AG 2220
CrGmGTTO ATTACAGATA CAGCCAG6CT GATGCCCTGA AQTATGTCGG CATCGAAAGA 2280
GAAATGQAAA TCCCTTGACA TCIGCTACCT CCTCCCCCrC CTCTGAJUICA GCTGCCTTAG 2340
CTTCftGSAAC CT06AGTACT GTGGGCAATT TAGAAAAAQA ACATGCASTT TGAAATTCTG 2400
AATTTGCAAA GTACTOTAAG AATAATTTAT AGTAATGAGT TTAAAAATCA ACTTTTTATT 2460
GCCTTCrCAC CAGCTGCAAA GTGTTTTGTA CCAGTGAATT TTTGCAATAA TGCA6TATG0 2520
TACATTTTTC AACTTTGAAT AAA3AATACT TGAACTTGAA AAAAAAAAAA AAAAAA



Seq ID NO: 113 Protein sequence:
Protein Accession #: NP_004447

1 11 21 31 41 51

I I I I I I

MGQTGKKSHK GPVCWRKRVK SEYMRLRQLK RFHSADEVK8 MFSSNRQKIL ERTBILNQEN 60 

KQRRIQPVHI LTSVSSLRGT RECSVTSDLD FPTQVIPLKT LNAVASVPIM YSWSPLQQNF 120 

MVEDETVLHN IPYMGDEVLD QDGTPZEELI XNYDGKVHGD RBOQFINDEI PVELVNALGQ 180 

YNDDDDDDDG DDPEBREEKQ KDLEDHRDDK ESRPPRKPPS DKILEAISSM PPDKGTAEBL 240 

KEKYKELTEQ QLPGALPPEC TPNIDGPNAK SVQREQSLHS PHTLFCRRCP KTOCFLHPPH 300 

ATPNTYKRIQI TETALDKKPC GPQCYQHItEG AKEFAAALTA ERIKTPPKRP GGRSRGRLFN 360 

MSSRP8TPTI NVLESXDTDS DREAGTETGG ENNDKBEEBK XDETSSSSBA N8RCQTPIKH 420 

KFUIEPPENV EHSGAEASMF RVLIGTVYDN FCAIARLX6T ICrCRQVYEFR VKBS8IIAPA 480 

PAEDVDTPPR KKKRKHRLWA AHCRKIQLKK DGSSNHWNY QPCTHPRQPC DSSCPCVIAQ 540 

NFCEKFOQCS SEC3QKRFPGC RCaCAQCNTKQ CPCYLAVRBC DPDLCLTCGA ADHWDSKNVS 600 

CKNCSZQRGS KKHLLLAPSD VAGWGIFIRD PVQKNEFZSE YCGBIISQDE ADRRGKVYDK 660 

YMCSFLFKLN NDFWDATRK GNKIRFANHS VNraiCYAKVM MVNGDHRZGI FAKRAIQTGB 720 
ELFVOyRYSQ ADALIOrVGIB RaffilP 

Seq ID NO: 114 DNA sequence
Nucleic Acid Accession #: NM_001827
Coding sequence: 96-335

1 11 21 31 41 51

I I I I I I

AGTCTCGGGC GAGTTGTTGC CTGGGCTGGA CGTGGTTTTG TCTGCTGCGC CCGCTCTTOG 60 

CQCTCTCGTT TCATTTTCTG CA6G6CGCCA CQAGGATGGC CCACAA3CAG ATCTACTACT 120 

CGGACAAGTA CTTOQAGOAA GACTAOQAOT AC0G6CAT6T TATGTTACCC AGAGAACTTT 180 

CCAAACAAGT ACCTAAAACT CATCTGATGT CTGAAGAGGA GTG6AGGAQA CTTGGTGTCC 240 

AACAGAGTCT AGGCTGGGTT CATTACATGA TTCATGAGCC AGAACCACAT ATTCTTCTCT 300 

TTAGAQGACC TCTTCCAAAA GATCAACAAA AATGAAGTTT ATCTGQGGAT OGTCAAATCT 360 

TTTTCAAATT TAATGTATAT GTGTATATAA GQTAGTATTC AGTGAATACT TGAGAAATGT 420 

ACAAATCTTT CATOCATACC TGTGCATQAO CTGTATTCTT CACAGCAACA GAGCTCA6TT 480 

AAATGCAACT GCAAGTAGGT TACTGTAAGA TGTTTAAGAT AAAAGTTCTT CC3U3TCAGTT 540 

TTTCTCTTAA GTGCCTGTTT GAOTTTACTO AAACAGTTTA CTTTTGTTCA ATAAAGmG 600 
TATGTTGCAT TTAAAAAAAA AAAAAAA 

Seq ID NO: 115 Protein sequence:
Protein Accession #: NP_001818

1 11 21 31 41 51

I I I I I I

MAHKQIYYSD KYFDEHYEYR HVMLPRELSK QVPKTHLMSE EEWRRLGVQQ SLGWVHYMIH 60

EPEPHILLFR RPLPKDQQK

Seq ID NO: 116 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

I I I I I I 

TCAGACCTCA TGAGTCACTT GGACTCTTGA GCCACCTCTG GGGGTG6AGT CTCTCTCCTG 60 

GCATCTGGAC CCTTQGTQCT ATCGACGAAG CTTGGGTGGG GCTCTTAGCT GCTATGTQCA 120 

AGAGGTGTGT TCCAGGGAAA GCCCCTATCT CTCTGCAGAG GTCAAGTGAA AGCGACGGCC 180 

GCAGCCAACA GAG7TCAAAA TGCAGGCTTG GAAAGTACAO GGG6CTCTGT GGAGOATGGG 240 

AAGGACTGAT CCACATTCCC ACGAGGAAGT TTAGCAGAAC CCCOGCGTGC CAACTGOACC 300 

CCTTGGAAG6 ACCTGGCTCA GGCTGGACCA CCTCTT6AGA QGQAGGAGCT CT6GATTTGA 360 

TCAAGAATTC TTTGCTGAGC AT6GTGCCTC ATGCCTATAA TACCAACACT TTGGGAG6CC 420 

AGT61X3GGAG GATCTCTTGA GCCCAGGAGT TCAAGACTAG CCTGGGCAAC ACAGAGA6AA 480 

CCCATCTCTA AAATAATAAT AATAATAAAA TAAAAAATTA GCAGGGCATG GTG6CATGTG 540 

GCTGTAGTTC CAGCTACCCA GGAGaCTGAG GCAAOAGGAT GGCIGGA6CC TGG6ATGTTG 600 

AGGCT6CAAT GAACTOTGAT TACCCCACTG CACT0CA6CC TGGGCAAAAG AG06A6AGAA 660 

CCTQTCTCAA ATAATAATAA TAATAATAAT CTTATTTTGG AGAATAAAGA GACCTCTGGA 720 

TTTGAGGTGC CATTTGGGTA GAAAGAAAAG AOGTTTACAC CGAGAAATAO TCTGTGTTGC 780 

CCTGAAGGAG CA6AGGGATG CAT06CTGGA GGTGACCTAC AGTTGAAGAA GACTCATTAT 840 

GACAGACCTT GTCCTTCTTC CTTGTGGAAA GTGTTTCCTC TGCE6CTACT GC TCAT OAOA 900 

CTCTTCCCCC TCCCTGTOCC AGG6AACCAA AGG6CTTTCT ACCACACCCT TTCTT6CCGC 960 

CCGCCTCCCA TGTCTGCTGT GCCTTTGTAC TCAGCAATTC TTGTTTGCTC CATTATCTTC 1020 

CAGCCG6ATA CAGAGTGAAT AGTTAACCAC ACTTAGGTCA AATAGGATCT AAATTTTTGT 1080 

TCCTGCTCOG TGTAAA6AGG CCAGTGTTTG TGTGTTGCAA GCAGCCTTGQ AATAGTAACT 1140 




CTTCTCATTT GTTTGQGATC TGGCCACCAA GTTCCAGAAT GATACACX5GA TCAGTGCAGA 1200 

AGTTCATCAG GCTCTCGGAC CTTAGGGCTG TTGGAGAAGO CTTCA6CAGC AGAACTGATG 1260

GTGAAGGCTC GTGTTCTCCA TCCTCAACTT TCTTTGCTTC QATCATACAC AAGAATACAT 1320 

TTGGAAGGGC AAAAAATGAA CACT6TC6TT CATTGCAGCC GTGTTTTGTG ACACAGATGC 1380 

ACAGTCTGCT GTGAAGACCT TCrCTCAAQT GGCATTTGQQ AGTCCATGCC AlSATCATGGT 1440 

GCTTCATGAG AGACTGACAG CTATCAGGGG TTGTGGCACT TA6TGAGGAC TCTCCTCCCC 1500 

CAGTGTGTGC TGATGACACA TACACACCTG ACAATAGCTT QAGTCTTCTC TGTTCCTTTT 1560 

ACTCT6TAGC CAACATACAC ATGATTTAAA ACCCTTTCTA AATATCTATC ATGGTTCATC 1620 

CTT6TCCAAA TGCAQA6TCA GAGCTATTT6 TACTTCATTA TTATTTCCAA GGOSAATAiGT 1680 

TG6CTTTCTT TTTGCAAAAA TAATTAAAGT TTTT61ATGT TGCAAAAAAA AAAAAAAAAA 1740 
AAACAAAAAA 

Seq ID NO: 117 DNA sequence

Nucleic Acid Accession #: BC012178.1

Coding sequence: 204-2285

1 11 21 31 41 51 

I I I I I I 

CTTCTCTCCC GG6GGGCTGG G6CC0GOQCT CCX3CTGCTGT TGCTCCATTC GG06CTTTTC 60 

TGGCGGCTOG CTCCTCTCCa CTQCGOQCTG CTCCTOGACC AGOCCTCCTT CTCAACCTCA 120 

GCOCGCGGCXS COGACCCTTC CGGCACCCTC CXX3CCCCGTC TOGTACTGTC GCCXSTCACOQ 180 

CCGCX3GCTCC GGCCCTOGCC CCGATGGCTC TGTGCAAOGG AGACTCCAAG CTGGAGAATG 240 

CTGGAGGAGA CCTTAAGGAT GGCCACCACC ACTATGAAGG AGCTGTTGTC ATTCTGGATG 300 

CT GGTGC TCA GTACXSGGAAA GTCATAGACC GAAGAGTGAG GGAACTGTTC GT6CA6TCTG 360 

AAMTTTCCC CTTGGAAACA CCAGCATTTG CTATAAAGGA ACAAGGATTC CQTOCTATTA 420 

TCATCTCTGO AGGACCTAAT TCTGTGTATG CTGAAGATGC TCXXrrGGTTT GATCCAGCAA 480 

TATTCACTAT TGOCS^GCCT GTTCTTGGAA TTT6CTATG6 TATCCy^GATG ATGAATAAGG 540 

TATTTQGAGG TACTGTGCAC AAAAAAA6TG TCAGAGAAGA TGGAGTTTTC AACATTAGTG 600 

T6GATAATAC ATGTTCATTA TTCAGGGGCC TTCAGAAGGA AGAA6TT6TT TTGCTTACAC 660 

ATGGAOATAG TGTAGACAAA GTAGCIOATG GATTCAAaGT TGTGGCACGT TCTGGAAACA 720 

TACrrAGCAOG CATAGCAAAT GAATCTAAAA AGTTATATG6 AGCACAOTTC CACCCTGAAG 780 

TTGGCCTTAC A6AAAATGGA AAAGTAATAC TGAAGAATTT CCTTTATGAT ATAGCTGGAT 840 

GCAGTGGAAC CTTCACCGTG CAGAACAGAG AACTTGAGTG TATTCGAGAG ATCAAAGAGA 900 

GAGTAGGCAC GTCAAAAGTT TTGGTTTTAC TCAGTGGTGG AGTAGACTCA ACAGTTTGTA 960 

CAGCTTTGCT AAATCGTGCT TTQAACCAAG AACAAGTCAT TGCTGTGCAC ATTGATAAIG 1020 

GCTTTATGAG AAAACGAGAA AGCCAGTCTG TTGAAGAGGC CCTCAAAAAO CTTGGftATTC 1080 

AGGTCAAAGT GATAAATGCT GCTCATTCTT TCTACAATGG AACAACAACC CTACX3ATAT 1140 

CAGATGAAGA TAGAACCCCA CGGAAAAGAA TTAGCAAAAC GTTAAATATG ACCACAAGTC 1200 

CTGAAGAGAA AA8AAAAATC ATTGGGGATA CTTTTGTTAA GATTGCCAAT 6AAGTAATT6 1260 

GAQAAATGAA CTTGAAACCA GflGGAQGTTT TCCTTGCCCA AGGTACTTTA OGGCCTGATC 1320 

TAATTGAAAG TGCATCCCTT GTTGCAAGTG GCAAAGCTGA ACTCATCAAA ACCCATCACA 1380 

ATQACA CAGA GCTCATCAGA AAGTTGAGAG AG6AGGGAAA AGTAATAGAA CCTCTGAAAG 1440 

ATTTTGATAA AGATGAAGTG AGAATTTTGG GCA6AGAACT TGGACTTCCA 6AAGAGTTA6 1500 

TTTCCAiSGCA TCCATTTCCA GGTCCTGGCC TGQCAATCAG AGTAATATGT GCTGAAGAAC 1560 

CTTATATTTG TAAGQACTTT CCTGAAACCA ACAATATTTT GAAAATAGTA OCTGATTTTT 1620 

CTQCAAGTGT TAAAAAGCCA CATACCCTAT TACAGAGA6T CAAA6CCT6C ACAACAGAAG 1680 

AGGATCAGGA GAAGCTGATG CAAATTACCA GTCTGCATTC ACTGAAT6CC TTCTTGCTGC 1740 

CAATTAAAAC TGTAGGTGTG CAGQ8T6ACT 6TCGTTCCTA CAGITACGTG TGTGGAATCT 1800 

CCAGTAAAOA TGAACCTGAC TGGOAATCAC TTATTTTTCT GGCTAGGCTT ATACXTTCSGCA 1860 

TGTGTCACAA CGTTAACAGA GTTGTTTATA TATTTGGCCC ACCAGTTAAA QAACCTCCTA 1920 

CAGATGTTAC TCCCACTTTC TTGACAACAG GGGTGCTCA6 TACTTTACGC CAAGCTGATT 1980 

TTGAGGCCCA TAACATTCTC AGGGAGTCTG GGTATGCTGG GAAAATCAGC GA6ATGCCX3G 2040 

TGATTTTGAC AOCATTACAT TTT6AT00GG ACCCACTTCA AAAGCAOCGT TCATOCCAGA 2100 

GATCTGTGGT TATTCGAACX: TTTATTACTA GTGACTTCaT GACTGGTATA 0CTOCAAC31C 2160 

CTGGCAATGA GATCCCTGTA GAGGTG6TAT TAAASATGGT CACTGAfiATT AAGAAGATTC 2220 

CTGGTATTTC TCGAATTATG TATGACTTAA GATCAAAGCC C0CAG6AACT ACT6AGT6GG 2280
AGTAATAAAC TTCTTGTTCT ATTAAAA 



Seq ID NO: 118 Protein sequence:
Protein Accession #: AAH12178.1

1 11 21 31 41 51

I I I I I I

MALCNGDSKL EHAGGDLKDG HHHVEGAWI XiDAGAQYGKV ZDSRVRELFV QSEIFPIiBTP 60 

AFAIKEQGFR AIIISGGPNS VYAEDAPWFD PAIFTIGKPV LGICYGMOMM NKVFGGTVHK 120 

XSVREDGVFN ZSVDNTCSLP RGIiQKBBWL LTK6DSVDKV ADGFKWARS GtTIVAGIANE 180 

SKKLYGAQFH PEVGLTENGK VILKNPI.YDI AGCSGTPTVQ NRELBCIREI KBRVGTSKVL 240 

VLLSGGVDST VCTALLNRAL NQEQVIAVHI DNGPMRKRES QSVEBAUOa GIQVKVINAA 300 

HSFYNGTTTL PISDEDRTPR KRXSICrLNMT TSPSBKRKII GDTFVKIANE VZGEMNLKPB 360 

EVFLAQGTLR PCLXESASLV ASGiCAEIiIKr HBNDTELZRK LREEGKVIEP UKDPHKDBVR 420 

XLGRELGIiPB ELVSRHPPPG P6LAXRVICA EEPYICKDFP ETNNXIiKIVA DFSASVKKPH 480 

TLIjQRVKACT TEEDQBKLMQ ITSLHSLNAF LLPIKTVGVQ GDCRSYSYVC GISSKDEPDW 540 

BSLIPLARLI PRMCHNVNRV VYIPGPPVKE PPTDVTPTFI> TTGVLSTLRQ ADFEAHN1I,R 600 

ESGYA6KXSQ MPVILTPLHF DRDPIiQKQPS CQRSWIRTF XT8DFMTG1P ATPGNEXPVB 660 
WltlOWTEIK XIPGXSRIHY DLTSKPPSTT ENE 

Seq ID NO: 119 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967

1 11 21 31 41 51

I I I I I I

ACTTGCGTCT OGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 
TCGCCGCCTG CTGCTGCTGT CCTCGC3GTCG CGGGTGTGCC C3GGAGAGGCT GA6CA6CCTG 120 
CGCCTGA6CT GGTGGAGOTO GAAGTGGGCA 6CACAGCCXT TCTGAAGTGC GGCCTCTCXX: 180 
AGTCCCAA06 CAACCTCAGC CATGTCGACT GGTTTTCrOT CCACAAGQAg AAGOSOACXZC 240 
TCATCTTCCG TGTGCGCCAG GGCXaGGGCC AGAGC3GAACC TGGGGAOTAC OaOCAGOQGC 300

TCRGOCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 

GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGOA GTACCGCATC CAGCTCCGCG 420 

TCTAC3UUM3C TCCXSGAQOAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCT6TGAACA 480 

GTAAGGAGCC TGAGQAGGTC GCTACCTGTO TAGGGAGGRA C3G0GTACC0C ATTOCTCAAG 540 

TCATCTGGTA CAAGAATG6C OGGCCTCTGA AG6AGGAGAA GAACCGGGTC CAC»TTC3«5T 600 

CQTCCCROAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACXX3G CTGCCCAGTG 720 

GGAACCACAT GAAGGAGTOC AGGGAAGTCA CCGTCCCTGT TTTCTACXXB ACAGAAAAAG 780 

TGTQGCTGGA AGTG6AGCCC GTGGGAATGC TGAA6GAAGG GGftCXJQOGTG GAAATCAGQT 840 

QTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCRGAAC CCCAOCAOCA 900 

GG6AG6CA6A GGAAGAGACA ACCAACOACA ACGGGGTCCT GGTGCTGQAO CCT0CCCX3QA 960 

AGGAACaCAQ TGGGCGCTAT GAATGTCAGG CCTGGAACTT G6ACACCATG ATATCGCTGC 1020 

TGAGTQAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA C6TCCX3AGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAG6AAGGC AGOUSCCTCA CCCTGACXTTG TQAGGCWaAG AOTAGCCAGG 1140 

ACCTCGAGTT CCACTGGCTG AGAGAAGAGA CRGACCAGGT GCTQGAAAGG QGGCCTGT6C 1200 

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGC36TG GC3GTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TQGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCXX: CGOCCCACCA TCTCCTGGAA CGTC3UWX3GC ACGGCAAGT6 1440 

AAC3UIGACCA AGATCCACaO C6AQTCCT6A 6CA0CCTGAA TGTCCTCGTQ ACCCCGGAGC 1500 

TGTTGGAGAC AGQTOTTOAA TGC3VCGGCCT CCAACXSACXTT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTQQA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGQCC 1620 

TCAOCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680

TOCOGOAGCC G6AGAGCCX3G GGOGTGGTCA TCX3TG6CTQT GATTGTOTGC ATCCTGOTCC 1740

TQGCGGrGCT GGGOGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGT6CAGGC 1800 

GCTCAGGGAA GCAGQA6ATC ACX3CTG0CCC CGTCTOSTAA GACOSAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAfiCTCCCA GAAGAGATGG QCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CX3AATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTOCCA OCTCCCTGCT CACTCTTCTC T CAGC CAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCX3CTTT TCAGGGACCA 2160 

GTCXavCCACC ATCTCXTCCA OGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCXXAGTCTC 2220 

CCGftGOGGGT AGGAGAGTTT CTTGCASAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGO CTCCTQOCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTPCCTGCCC 2340 

CAAAQGCTGO CTTOCACCAT CCAOGTOCAC CACTGAAGTG AGGACACACC GGAGCCAGQC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCXSCTCOGG AGAGCACCXTC AGCX^GCATOC 2460 

AGAAGCA6CT GCAGTGTTGC TGCCAOaiCC CTCCTGCTOG CCTCTTCAAA OrCTOCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAA6A TACXSTGCOGG 2580 

GGCCAGGTGT G6TGGCTCAC GCCTGTAATC CC3M3CACTTT GQGAGGCOGA GGOGGGCGGA 2640 

TCACAAAGTC A6GAC3GA6AC CATCCTGGCT AACAOGGTQA AACCCTGTCT CTACTAAAAA 2700 

TAC3VAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGQA GGTGGAGCTT GCAGTGAfiCC 6AGACCGIGC 2820 

CACTGCACrC CAGCCTGGGC AACACAGCGA GACTCCGTCT C6AGGAAAAA AAAAGAAAAG 2880 

AOSCGTACCT G0GGTGAG6A AQCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTX5CTCC CATAGCCCTC TTGATGGATC AOGTAAAACT GAAAGGCAGC 3000 

GGGGAGC3W3A CAAAGATGAG GTCTACACTG TGCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA OGGCCCCAAC CCTAGAAG6G CCCAAATQAG 3120 

A6AATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GI GTgT CTGT 3180 

CTGTGTGTAT QCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTQTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TCCTTTTTTA TTCT ACATG G G TACCA CAGG 3360 

AACCTGGGG6 CCTGTGAAAC TACAACCAAA AG6CACACAA AACCGTTTCC AOTTCGCAGC 3420 

AGAGATCAGO OGTTACCTCT GCTTCTGAOC AAATOOCTCA AGCTCTAOCA GAGCAGACAG 3480 

CTACCXTTACT TTTCAGCM3C AAAAOOIOCC GIATQACXSCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTGC3GTCCA CTT 

Seq ID NO: 120 Protein sequence: 
Protein Accession #: NP_006491.1

1 11 21 31 41 51 

ilGLPRLVCAF iliAACCCCPR VAGVPGEAEQ PAPELVEVEV QSTALLKOGL SQ9QGNLSHV 60 

DWFSVHKBKR TLIFRVRQGQ GQSBPGBYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120 

PRSQEYRIQL RVYKAPBBPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKMGRP 180 

LKEEKNRVHI Q8SQTVSS86 LYTLOSILKA QXiVKEDKDAQ FYCBLNYRLP SGMHMKESRE 240 

VTVPVFYPTE KVWLGVEPVQ MLXEGDRVSZ ROADGHPPP HFSXSKQNPS TREAEEETTN 300 

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPKRQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420 

QLVKLAIPGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV 480 

IiSTUaVLVTP ELZiBTGVECT ASMDW3KNTS ILPLELVNLT TLTPDSNOTf QZiSTSTASPB 540 

TRAIfSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY PLYKKGKLPC RRSSRQEITL 600 
PPSRKTBLW BVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRB 

Seq ID NO: 121 DNA sequence 
Nucleic Acid Accession #: NM_018306
Coding sequence: 60-671

1 11 21 31 41 51

I I I I I I

ATAGTCTACA CAGAGCTCCX: CTTGCTGCCC AGACAAGCTG AAGGACCACA GGAAAAGCCA 60 

TGGAGACTTC AGCATCX^'OC TCCCAGCCTC AGGACAACAG TCAAGTCCAC AGAGA AACAG 120 

AAGATGTAGA CTATGGAGAG ACAGATTTCC ACAAGCAAQA 06GGAAGGCT GGACTCTTTT 180 

CCCaAGAACA ATATGAGAQA AACAAGTCTT CTTCCTCCTC CTTCTCTTCC TC CTCATCC T 240 

CCTCATCTTC TTCATCCTCC TCCTCCTCAG GTCCTGGGCA TGGGGAGCCT GACSTTTTQA 300 
AG6A.TGAGCT TCAACTCTAT GGAGATGCTC CTG6AGAG6T GGTACCCTCT G6GGAATCA6 360

GACTCCGAAG GAGAGGCTCT GACCCAGCAA GTOQAOAAGT GGAGGCCTCT CAGTTAAGAA 420 

GACTGAATAT AAAGAAAGAT GATGA6TTTT TCCATTTCGT CCTCCTGTGC TTTGCCATCG 480 

GGGCCTTGCT GGTGTGTTAT CACTATTACG CAGACTGGTT CSVrGTCTCTT GGGGTCGGCX: 540 

TQCTCACCTT GQCCTOCCTO GAAA0CX3TTG GCATCTACTT CX36ACTAGTG TACC6TATCC 600

ACAGOGTCCT GC3UU3GCTTC ATCCCCCTCT TCCAGAAGTT TAG6CTGACA GGGTTCA6GA 660 

AGACTGACTG AGGCCACTTC CAGGTQGGCA GCAGAGGCAG GCCCCAGTGT GACCACCACT 720 

GOSACCCCTO AGCCCACAAG GGCAGAGCAG CATTCTGAGA GACGCACAGG AGACCAA6CC 780 

AQACCAATAA ACAGAACACT TTTCCTTCCA TGTGGTCTGA ATGTTG6CAC CAGCCCX3GGC 840 

AGGGQCATCT CATTTG6GCA GTACTGCIGT GCAACCCAGC TGCAA0QAT6 GAAGGCAOAG 900 

G6TGG6T6TG GOGCCTGAGG CTTCACAGTA CCTGGACCAG CAGGAAGATT CTGGGAGGTC 960 

ACTGCTCTCA GAGGACAGCA AGGGACCCT6 AGCTCTQCAA GCTOTGATCT GTCTGGGTTC 1020 

ATGGTTTTTC TCAAATCCCA GGCTATCTGC ATGCX3CTCTC AGGTGCTACC GAGCCATCCT 1080 

GGGAGA6ATG 6AT6GTCCAC TGCTTTGA66 CAGGGAGCCA TCQGGCTGGG GCCCCTTGGT 1140 

GAACCTGAT6 CAGGTAAGAT GCTGAGGACT AAAACCATTT TTTTTGCACC CAAAAAAAAA 1200 

GGCAGGAAAA TGATCATCAG AAACTAAATG GCAGCCA66C ATGGGGGCTC ACGACTGTAA 1260 

TCXrrCGCACT TTGGGAGGCT CAGGCTAAGG GTCGCTTGAA GCTGAGAGTT CAAGACXyUlC 1320 

CTGGGCAACA TAGTGAGACC CCCATCTCTA CAATTTTTTT TTAATGACCA AATGTGG0G6 1380 

TACATACCT6 TACATACCTG C3GGTTCCA6C TACTCAAGAG GCTGAGGCAQ GAGGACTGCT 1440 

TGAGCCCAOG AGTTCAGGGC TGGAGTGAG6 TAOQATCAAG CCACTGCACT CX3U3CCT6GG 1500 
GGACAQA6CA AGATCGTTTC TCTAAAATT 



Seq ID NO: 122 Protein sequence:
Protein Accession #: NP_060776

1 11 21 31 41 51

I I I I I I

METSASSSQP QDNSQVHRET EDVDYGETDF HKQDGKAGLP SQBQYESHKS SSSSFSSSSS 60 

SSSSSSSSSS GPGRGEPDVL KDELOLYGOA PGEWPSGES GIiRRHGSDFA SGEVEASQIA 120 

RLNIKKDDEF FHFVLLCFAI GAIiIiVCyHYY ADHPMSIiGVO LLTFASLBTV GIYPGLVYRZ 180 
HSVLQGFIPL FQKFRLTGFR KTD 

Seq ID NO: 123 DNA sequence
Nucleic Acid Accession #: BC022542
Coding sequence: 243..696

1 11 21 31 41 51

I I I I I I

ACTTGGTCCC AGCCGATAAA TCTGGGGCAG CX3CGCGGTAG GAGCTQCGGG OGGCCAGGCC 60 

CCTTCCTGCG TCCGCACCTG GCCCCGC6CG CCCCTCTCGG GOGTCOGGCT TCCGGCGTCC 120 

TGGCGGCTGG GGTQGQG6C36 GTTC2GGGGQ6 COQCCrOQCT GCTCCTCGG6 GOQGGSACGG 180 

GGCTCACGCG CGGQCCOGGC AOQGCCTTCA C0QC0G0GO3 CTCTGA06CC GGCATAAGGQ 240 

CCATGTGTTC TGAAATTATT TT6AGGCAAG AA6TTTTGAA AGATGGTTTC CACA8AGACC 300 

TTTTAATCAA AGTGAAGTTT GQGGAAAGCA TTGASGACTT GCACACGTGC CGTCTCTTAA 360 

TTAAACAGGA CATTCCTGCA GGACTTTATG TGGATCCGTA TGAGTTGGCT TCATTACGAG 420 

AGAGAAACAT AACAGAGGCA GTGAIGGTTT CAGAAAATTT TQATATAGAG GCCCCTAACT 480 

ATTTOrCCAA GOAQTCIGAA GTTCTCATTT ATGCCAGAOG AOATTCACAO TGCATTGACT 540 

GTTTTCAAGC CTTTTTGCCT GTGCACT6CC 6CTATCAT06 GCOSCACAGT GAAGATG6AG 600 

AAGCCTCGAT TGTGGTCAAT AACCCAGATT TGTTGATGTT TTGTGACCAA GAGTTCCCGA 660 

TTTTGAAATG CTGGGCTCAC TCAGAAGTGG CAGCCCCTTG TQCTTTGGAT AATCAGGATA 720 

TATGCCAATG GAACAAGATG AAGTATAAAT CAGTATATAA GAATGTGATT CTACAAGTTC 780

CAOrGGGACT 6ACT6TACAT ACCTCTCTA6 TATQTTCTGT GACTCTOCTC ATTACAATCC 840 

TGTGCTCTAC ATTGATCCTT GTAGCAGTTT TCAAATATGG CCATTTTTCC CTATAAGOT 900 

TATGTAGTTA AATGCTTCCT AGAAACCTAA ATAAGATCTA TTAATTTCTG ACGAGAGGTG 960 

TTCTTCTAGA ATTAATTACT TTTATCTTTT GTCTTCATTT GTGGCCAAAA TTAT6TTTAC 1020 

TAGAG6AAAT TTGG6ATCAT TCTCAGCTAA TTCX3UVAATG TAGTGCTCTA TTGCATGGAT 1080

CCTTGGTAAT CCTCMGCAT CAQATGCCAT AAGGGGAAAC TTAATTCTGC TAAATTAATG 1140 

TTTATTTTGT GAGAAGTGAC TTTATCTTCA TTTGGGGTAG AAAAATTATT TCTTTAT6TA 1200 

GTAGAGACAA ATTATTCTCA TTTTGCAAGT ACTTTCAATT TAAGCTACAA ATTGAGAAAA 1260 

COGTTATAAA TAAGAATAAA ATAGGCCAGG CACAGTGGCT CACACCTGTA ATCCCAQCAC 1320 

TTTGGGAGGC CGAGGTGGGC GGATCACCAG AGGTCAAGA6 TTTGAGACCA GCTTGGTGAA 1380

ACCCT6TCTC TTICTAAAAAT AGAAAAOTTA GCTGG6GCT6 GTG0TGG6CA TCTGTAGTCC 1440 

CAGCTAATTG GAAGGGTGA6 GOGGGAGGAT CGCTTGAACC T6G6AGGC06 AG6TTCCAGA 1500 

GA6CCAAGAT CGCACCACTG CACTACAGCC T6G6CX3ACAG AACGAGACCX: TGTCTCCAAA 1560 

GGAAAAACAA AAAAGAAGAA TAAAATAATT TGGATGAAAA TCATGTTTAT TTAAATAGTA 1620 

ATGTCATGAG ACTATTAAAG ATGTGCCAGA GTTTCAATGA AAATCATTAA AGTAGGACA6 1680 

CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA TTATTCCTTA 1740 

ACGCACTOCA TTCTCCTTTT ACATTTTATC ATGTTTCTTT TGAATATATG AATTGGCAAA 1800 

GQACTT6AT6 AAACT6AGTA CTAAGATTTG GTACAGAGTA TGTCAGGAAG ACAACTCAGA 1860 
TTGCCATTTT AAATAAAGTT OTACATaAAC AAAAAAAAAA AAAAAA 



Seq ID NO: 124 Protein sequence: 
Protein Accession #: AAH22542

1 11 21 31 41 51

I I I I I I

MCSEIILRQE VLKDGPHRDL LIKVKPGESI EDLHTCRLLI KQDIPAGLYV DPYELASLRE 60 
RNITEAVMVS EKFDIEAFNY LSKESEVLIY ARRDSQCIDC FQAFLPVHCR YHRPHSEDGE 120 
ASIWNNPDL UUFCDQAGSR RHIRFRFOSF DKTXEFPILK CKAESEVAAP CAIiENEDIOQ 180 
WNKHKYKSVY KNVILQVPVG LTVHTSLVCS VTLLITILCS KKKKK 

Seq ID NO: 125 DNA sequence 

Nucleic Acid Accession #: NM_004994.1 

Coding sequence: 20..2143

1 11 21 31 41 51

I I I I I I

AGACACCTCr GCCCTCACCA TGAGCCTCTG GCAGCCCXTPO GTCCTGGTQC TCCTGGTQCT 60 

GGGCTGCTOC TTTGCTGCCC CCAGACAGCG CCAGTCCRCC CTTGTGCTCT TCCCTGGAGA 120 

aCSGfiSSMCC MOCKAOaa ACAGGCAOCT GOGAOAGaAA. TACCT6TACC GCTAXGGITA 180 

CACTCGGGTG GCAGAGATOC 6TG6ASU3TC GAAATCTCTG GG6CCTG0GC TGCT6CTTCT 240 

CJCaOAAGCAA CTGTCCCTGC COSAGACOSG TGAGCTGGAT AGCGCCACGC TGAAGGCCAT 300 

QGGAAOCCCA CGQTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGOGACCT 360 

CAAOTG6CAC CACCACAACA TCACCTATTG GATCCAAAAC TACT0C3GAAG ACTTGCCGOG 420 

6G0G6T6ATT GAC3GAC36CCT TT6CCG6CX3C CTTCGCACT6 TG6A608GGG TGACGCCGCT 480 

CS^CCTTCSICT OGOGTGTACA 6GC35GGAOGC A6ACATGGTG ATGCA6TTT6 GTGTCaSCGGA 540 

GGAOGOAGAC GGGTATCCCT TCGA06GGAA GGACX3GGCTC CTGGCACACG CCTTTCXTCC 600 

TCGCCCCQGC ATTCAGGGAG ACGCCCATTT CGACGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GC6GCCTGCC ACTTCCCCTT 720 

CATCTTCGAG GGCC6CTCC7 ACTCT6CCTG CAOCAOOGAC GGT06CTGG0 A06GCTTGCC 780 

CTGGTGCAGT ACCACGGCCA ACTACGACAC OQAGGACOQG TTTGGCTTCT GCCCCAGCGA 840 

GAGACTCTAC ACCCJGGGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGACGGTCX3C TCCGACGGCT ACCGCTGGTG 960 

OGCCACCACC GCCAACTACX5 ACCXX3GACAA GCTCTTOGGC TTCTGCCCX3A CCCGAGCTGA 1020 

CTCGAOGGTG ATG0G6GGCA ACTCG6CGG6 GGAGCTGT6C GTCTTCCCCT TCACTTTCCT 1080

OGGTAAQQAG TACTCGACCT GTACCAGCSGA GGGCC3GCGGA GATGGGC6CC TCTGGTGCGC 1140 

TACCACCTOS AACTTTGACA GCGACAAGAA GTGGGGCTTC TGCCXJGOACC AAQGATACAG 1200 

TTTGTTCCTC GTGGCOGCGC ATGAGTTCGG CXACGCGCTG GGCTTAQATC ATTCCTCAGT 1260 

GC0GGAG6CG CTCATGTACC CTATGTACOG CTTCACTGAG GGGCCCCCCT TQCATAAGGA 1320 

OGACGTGAAT GGCATCOGGC ACCTCTATGG TCCTCGCCCT GAACCTQAGC CAOGGCCTCC 1380 

AACCACCACC ACACCCCAGC CCACG6CTCC CCOGACGGTC TGCCCCACCG GACCCCCCAC 1440 

TGTCCAOCCC TCAGAGCX5CC CXACAGCTGG (XCCACAGGT CCCCCCTCAG CTGGCCCCAC 1500 

AGGTCCOCCC ACTGCT6GCC CTTCTACGGC CACTACTGTG CCTTTGAGTC OSGTGGAOGA 1560 

TOCCTGCAAC GTGAACATCT TCGACGCCAT CXSOGGAGATT QGGAACC3W3C TOTATTTGTT 1620 

CAAGGATGGG AAGTACTGGC GATTCTCTGA GGGCAGGGGQ AGCCGGCCGC AGGGCCCCTT 1680 

CCTTATOGCC GACAAGTG6C COSOGCTGCC COGCAAGCTG QACTCGGTCT TTGAGGAGCC 1740 

6CTCTCCAAG AAGCTTTTCT TCTTCTCTGG GCGCCAGGTG TGGGTGTACA CAGGOGOGTC 1800 

GGTGCTGGGC C0GAGGCX5TC TGGACAAGCT GQGCCTGGQA GCCGA0GIG6 CCCAGGTGAC 1860 

CGQGGCCCTC CGGAGTQQCA GGQGQAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTCXSAOQTG AAGGCGCAGA TGGTGGATCC COQGAGOGCC AGCGAGGTGG AC06GATGTT 1980 

CCCCGGGGTG CCTTTGGACA CX5CACGACGT CTTCXSVGTAC CGAGAGAAAG CCTATTTCTG 2040 

CCAGGACGGC TTCTACTGGC GC6TGAGTTC COSGAGTGAG TT6AACCAG6 TGOACC AAOT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGOGCTCCC GTCCTOCTTT 2160 

GCAGTGCXZAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGOOGGATA 2220 

CAAACT66TA TTCTGTTCTG GAG6AAAGGG AGGA6TGGAG GTGGGCTGGG COCTCTCTTC 2280 
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 

Seq ID NO: 126 Protein sequence: 
Protein Accession #: NP_004985.1 

1 11 21 31 41 51 

I I I I > . ' . 

MSXMQPLVLV LE.VIiGCCFAA PRQRQ8TLVL FPGDZATNLT DRQLAEE^^Y RYOYTRVAEH 60 

RGBSKSLGPA LLLLQXQLSL PBTGBU)SAT LKAKRTPROG VPDLGRFOTF EGDLRWHHRN 120 

ITYWIQNYSB DLPRAVIDDA FARAPALWSA VrPLTPTRVY SRDADIVIQF GVAEHGDGYP 180 

FDGKDGLLAH AFPPGPGIQG DAHPDDDELW SIiGKGVWPT RPGNADGAAC HPPPIPEGRS 240 

Y8ACETDGRS SGLPWCSTTA HYDTOORFGF CPSERIiYTRD QNADGKPOQF PFIFQGQSYS 300 

ACTTDGRSDG YRKCATTANY OSOKLFGFCP TRADSTVKGG NSAGKLCVFP FTFLGKEYST 360 

CTSBGRCaJGR LWCATTSNPD SDKRNOFCPD QGYSLFLVAA BBF6RALGLD HSSVPEALMY 420 

PMYRPTBGPP LHKDDVNGIR HLYGPRPEPB PRPPTTTTPQ PTAPPTVC3>T GPPTVHPSKR 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACNVNI FDAIAEIOIQ LYLPKDGKYW 540 

RFSEGR6SRP QGPFLIADKW PALPRKLDSV FBEPXiSKKLF FF8GRQVWVY TGASVLGPRR 600 

IiDKLGLGADV AQVTGAIiRSG RGKNLZiFSQR RZMRFDVXCAQ KVDFRSASEV DRMFPGVPLD 660 
THDVFQYRBR AYFCX)DRFYN RVSSRSELNQ VDQVGYVTYD IbQCPBD 

Seq ID NO: 127 DNA sequence 
Nucleic Acid Accession #: NM_004181 
Coding sequence: 32-670 

1 11 21 31 41 51 

i I 1 I t I 

GCAGAAATAG CCTAGGGAGA TCAACCCOGA GA7GCTGAAC AAAGTGCTGT CCCGGCTGGG 60 

GQTGGCOGGC CAGrGGOGCT TCSSTGOACGT GCTGGGGCTG 6AAGAGGAGT CTCTGGGCTC 120 

GGTGCCAG06 CXTTGCCTGOQ OGCTGCTGCT GCT8TTTCCC CTCAC366CCC AGCAT6AGAA 180 

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA 6TTAGTCCTA AAGTGTACTT 240 

CATGAAQCAG ACCATTG6GA ATTCCTGTGG CACAATOGQA CTTATTCACX3 CAQTGGCCAA 300 

TAATCAA6AC AAACTGGGAT TTGAGGATGG ATCAGTTCTG AAACA6TTTC TTTCTGAAAC 360 

AGAGAAAATG TCCCCTGAAG ACAGAGCAAA ATGCTTTGAA AAOAATGAGG CCATACAGGC 420 

A6CCCAT6AT 6C0GTGGCAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAATTTCCA 480 

TTTTATTCTG TTTAACAACG TGGATGGCCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540 

TCCGGTGAAC CATGGCGCCA GTTCAGAGGA CACCCTGCTG AAQGAOOCTG CCAAGOTQTG 600 

CAGAGAATTC ACXXSAGCGTG AGCAAGGAQA AGTCCX3CTTC TCTGCCOTGG CTCTCT6CAA 660 

GGCAGCCTAA TGCTCTGTGG GAGGGACTTT GCTOATrTCC CCTCTTCCCT TCAACATGAA 720 

AATATATACC CCCCATGCAG TCTAAAATGC TTCAGTACTT GT6AAACACA GCTGTTCTTC 780 

TGTTCTGCRG ACACXKTCTTC CCCTCA6CCA CACCCAGGCA CTTAAGCACA AGCAGA6TGC 840 

ACAGCTGTCC ACTGGGCCAT TGTGOTGTOA OCTTCAGATG GTGAAGCATT CTCCCCAGTG 900 

TATGTCTTGT ATCCGATATC TAACQCTTTA AATGGCTACT TTGGTTTCTG TCTGTAAGTT 960 
AAGACCTTGG ATGTGQTTAT GTTOTCCTAA AGAATAAATT TTGCTGATA6 TAGC 

Seq ID NO: 128 Protein sequence: 
Protein Accession #: NP_004172 

1 11 21 31 41 51 
| | | | | | 
MLNKVLSRLG VAGQWRFVDV LGLEEESLGS VPAPACALLL LFPLTAQHEN FRKKQIEELK 60 
GQEVSPKVYF MKQTIGNSCG TIGLIHAVAN NQDKLGFEDG SVLKQFLSET EKMSPEDRAK 120 
CFEKNEAIQA AHDAVAQEGQ CRVDDKVNFH FILFNNVDGH LYELDGRMPF PVNHGASSED 180 
TLLKDAAKVC REFTEREQGE VRFSAVALCK AA 

Seq ID NO: 129 DNA sequence 
Nucleic Acid Accession #: NM_000213 
Coding sequence: 127-5385 

1 11 21 31 41 51 

C0COCGCOC30 CTQCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGOGGAOOG AGCX3AOTOC30 60 
OCCCGAGSTA GGTCCAGGAC GGGCXSCACAG CAGCAGCCGA GGCTGGCCG6 OAGAGGGAGG 120 
AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 180 
AGCOTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGOCCCCAGT GAAGAGCTGC 240 
ACG6AGTGTG TCCGTGTCQA TAAGGACTGC GCCTACTGCA CAGA0GA6AT GTTCAGGGAC 300 
aSQOSCTGCA ACACGCAGGC GGAGCIGCTG GGQGOGGGCT GCGAGOGGOA OftGCA TCiQTG 360 
GTCATQ6AQA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTOACACCAC CCTGOG GOQC 420 
AGCCAGATGT CCCCCCAAGG CCTGOGGGTC CGTCTGCGGC CCGOTGAGGA GOGGCATTTT 480 
GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540 
TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGG6GCA6AA CCTGGCTCGG 600 
GTCXnmOCC AGCTCACCAG CGACTACACT ATTOGATTTG GCaAGTTTGT GGACAAAGTC 660 
AGOGTOCCGC AOACQGACAT 6AGGCCTGA0 AAGCTGAAGO AOCCCTGGCC CAACAGTGAC 720 
CCCCCCTTCT CXrrrCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780 
AAACTGCAGG GAGAGCGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGCX3G CTTCGATGCC 840 
ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCOGGACAG CACCCACCTG 900 
CTGGTCTTCT CCAOOGAGTC AGCCTTCC»C TATGA66CTG ATGGC3G0CAA CGTGCTGGCT 960 
GGCATCATGA GCOGCAACGA TGAAOGGTCC CACCTGGACA CCAC3GG6CAC CTACACCCAQ 1020 
TACAGGACaC AGGACTACCC GTCGGTGCCC AOCCTGGTGC GCCT6CTCGC CAAGC31CAAC 1080 
ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA OCTTCACACC 1140 
TATTTCCCTO TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGQAGCTQ 1200 
CTGGAGGAGO CCTTCAATOG OfcTOCJGCrCC AACCTGGACA TCCQGGCCCT AGACAGCCCC 1260 
CGRGGCCTTC GGACSGAGGT CS^TCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 
CACATCCGGC GGGGGGAAGT GGGTATATAC CAQGTGCAGC TGCGGGCCCT TGAGCACGTG 1380 
GATGGGACGC ACGTGT6CCA GCTGCXXMAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 
TCCTTCTCCG AOSGCCTCAA GATGQAOGOS GGCATCATCT GTGATGTGTG CACCTGCGA6 1500 
CIGCAAAAAG AGGTOCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGAC3«3 1560 
TGTGTOTGCA GOSAGGGCTG GAGTGGCCRG ACCTGCAACT GCTCCACCGG CTCTCXGAOT 1620 
QACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAAGCCGT GCT0CX3GC0G TGGGGAGTGC 1680 
CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCGGCTACQ AGGGTCASTT CTGCGAGTAT 1740 
QACAACTTCC AGTGTCCCOS CACTTCCGGG TTCCTCTGCA ATOACCGAQG AOGCKCTCC 1800 
ATOG6CCAGT GTGT6TGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860 
AATOCCACXrr GCATCGACAG CAATGGGQGC ATCTGTAATG GAOTrGGCCA CTGTGAGTGT 1920 
GGCCGCTGCC ACTGCCACCA GCAflTCGCTC TACACX3GACA CCATCTGCGA GATCAACTAC 1980 
TCGGCGATCC ACCCGGGCCT CTGCGAGSRC CTAGGCTCCT GGGTGGftGTG CC3U3GOGTGG 2040 
GGCACCGGCG AGAAGAAGGQ G06CACGT0T GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100 
GACGAGCTTA AGAGAGCCGA GGAGQTGGTG GTGOGCTGCT CCTTCOGGGA CGAGGATGAC 2160 
GACTGCACCT ACAGCTACAC CATGGAAGGT GAOSGCXSCCC CTGGGCCCAA CAGCACTGTC 2220 
CTGGTGCACA AGAAGAAGGA CTGCCCTCCQ GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 
CTCCTCCTCC TGCXSaCTCCT OQCCCTGCTA CTGCTGCIAT GCTGGAAQTA CTG TGCCT GC 2340 
TGCAAGGCCT GCCTGOCACT TCTCCCGTOC TGCAACCQAG QTCACATGGT GGGCTTTAAQ 2400 
GAAGACCACT ACAT6CTGCG GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACGCCCATG 2460 
CTGCX3CAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATQ 2520 
CAGGGGCCTO GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GOTGCCCTAC 2580 
QGGCTGTCCT TGCGCCTGOC CCXSCCTPTOC AOOGAGAACC TQCTGAAQCC TGACACTCGG 2640 
GAGTGCGCCC A6CTGCGCCA G<aOGTGGAG QAGAACJCTGA ACSSAGGTCTA CRGGCAGATC 2700 
TCCGGTGTAC ACAAGCTCCA GCAGACC3UVG TTCCGGCAGC AGCCCAATGC CGGGAAAAAG 2760 
CAAGACCACA CCATTGTG<av CACAGTGCTG ATGGOGCCCC GCTCX3GCCAA GCCGGCCCTG 2820 
CTGAAGCTTA CAGAGAAGCA OGTGGAACAG AGGGCCTTCC AOGACCTCAA GGTGGCCCCC 2880 
G6CTACTACA CCCTCACTGC AGACCftGSAC GCCCS3Q6QCA TGGTGGAGTT CCAGGAGGGC 2940 
GTGGAGCTGG TGGACGTACO GOTGCCXCTC TTTATCCGGC CTGAGGATGA CGACGA6AAG 3000 
CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCACCCTCGG CCX3C0GCCTO 3060 
GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACGTGG TGTCCrTTGA GCAGCCTOAO 3120 
TTCTCGGTCA QCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGQAC 3180 
GGCGGGAAOT CCCAGOTCTC CTACCGCACA CAGGATGGCA CCGCGCAGGG CAACCGGGAC 3240 
TACATOXCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300 
GTGAA6CTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCCGCCA GGTCaSCCGT 3360 
TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TOGGCXZAGCC CCACTGCACX: 3420 
ACCATCATCa TCAGGGACCC AGATGAACTG GACCGGAGCT TCACQAGTCA QATGTTGTCA 3480 
TCACAGCCAC CCCCTCACG6 CGACCTGGGC GCCCCGCAGA AGCCCAATGC TAAGGCCGCT 3540 
GGGTCCAOGA AGATCCATTT CAACTGGCTQ CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600 
GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTC36A CAGCAAGGTG 3660 
CCCTCAGTGG AGCTCACCAA CCTGTACCCG TATTGOGACT AaPGAGATGAA GGTOTGOGCC 3720 
TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGOCGCAC CCAOCAGGAA 3780 
GTGCCCAGCG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840 
AGCTGGGCTG AGCCGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900 
CTGGTCAAOS ATGACAACGG ACCTATTOGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 
AAGAACCGGA TGCTGCTTAT TGAGAACCTT OGGOAQTCCX: AGOCCTACOQ CTACAOGGrQ 4020 
AAGGC6CGCA AOGGGGCCGG CTGGGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080 
CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCOT GGACGCCCAG 4140 
AGCGGGGAQG ACTACGACAG CTTCCTTATG TACAGC6ATG ACGTTCTACG CTCTCCATCG 4200 
GGCAGCCAGA GGCCCAGCGT CTCOGATGAC ACTGAGCACC TGGTGAATGG CCG6ATGGAC 4260 
TTTGCCTTCC OGGGCAGCAC CAACTCCCTO CACAGGATGA CCACGACCAG TGCTGCTGCC 4320 
TATGGCACCC ACCTGAQCCC ACACQT6CCC CACOGCGTGC TAflfiCACATC CTCCACCCTC 4380 
ACAOGGOftCT ACAACTCACT 0ACOC6CTCA GAACACTCAC ACTOGACCAC ACTGCOGAiGG 4440 

GACTACTCX3V CCXTTCACCTC CGTCTCCTCC CAOGACTCTC GCCTGACIGC TGGTGTGCCC 4500 

GACACGCCCA CCCGCCTQGT GTTCTCTGCC CTGOGGCCCA CATCTCTCAG AGTGAGCTGG 4560 

CAOCZAGCOQC GGTGOSAGGG GC06CT6CAG GGCTACAGTG TGGAGTACCA GCTGCIGAAC 4620 

GOGGGTOAOC TGCATOSGCT CAAGATGCCC AAOCCTOCCC AGACCTCGOT GQTGGTGOAA 4680 

6ACCTCCTGC CCAACCACTC CTAOGTGTTC 08CGTG0G66 CCCAGA6CCA G6AA06CTG6 4740 

GGCCGAQAQC GTGAGGC3TGT C3VTCACCATT GAATCCCAGG TGCACCCGCA GAGCCCACTG 4800 

TCTCCCCTGC CAGGCTCCGC CTTCACTTTG AGCACTCCCA GTGCCCCAGG CCCGCTGGTG 4860 

TTCACTGCCC T6AGCCCA6A CTC3GCTGCAG CTGAGCTGGG AGCGGCCACX3 GAGGCOCAAT 4920 

0GGGATATC8 T0Q6CTACCT GGTGAGCTGT GAGATG6GCC AAG0A6GA6G GCCAOCCACC 4980 

6CATTC06GG fGOATGOAGA CAGCCCOGAa AGC0QGCT6A CGGTGCOQGG CCTCAGC6AG 5040 

AACGTGCCCT ACAAGTTCAA GGTSCAGGCC AGGACCACT6 AGGGCTT06G GCCAGAG06C 5100 

GAGGGCATCA TGACGATAOA GTCCCAGGAT GGAGGACCCT TCCCGCAGCT 6GGCAGCCGT 5160 

GCaSGGCTCT TCCAGCACCC OCTGCAAAQC GAGTAC3U5CA GCATCACCAC CACCCACACC 5220 

AG05CCACX:G AGOOCTTCCT AGTG6ATGGG CGGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

OGOGGCTCCC TCACCGGGCA TGTGACCCAG GAGTTTGTGA GCOGGACACt GACCAOCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTT6ACCX5CA CCCT6CCCCA 5400 

CCXXXX3CCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCaGCXXJ^ CCCGCATGCA CftGAGCAQGG GCTAGGTOTC 5520 

TCCTGG6AGG CATGAAGGGG 6CAAG6TCCG TGCTCTGTGG GCCCA^CCT ATT TOTAAO C 5580 

AAAGAGCTGG OAGCAGCACA AGGACCCAGC CiTHi ' n ' Cai CACTTAATAA ATGOTTTTGC 5640 
ACTG 



Seq ID NO: 130 Protein sequence: 
Protein Accession #: NP_000204 

1 11 21 31 41 51 

II I I I I 

MAOPRPSPWA RLLLAALISV SLSGTLANRC KKAPVKSCTE CVRVDKDCAY CTDBMFRDRR 60 

CHTQAELLAA GOQRESIWM ESSPQITEBT QIDTTLRRSQ MSPQGLRVRL R PGEK RHPEL 120 

EVFEPLESFV DLYILMDPSN SMSDDUDNLK KMGQNLARVL SQLTSDYTI6 FSKPVDKVSV 180 

PQTDMRPEKL KBFWPNSDPP FSFKNVISLT EDVDBPIINKL QGBRISGNIiD APEG6FDAIL 240 

QTAVCTRDIG WRPDSTHLLV PSTESAFHYB ADGANVLAGI MSRNDBRCHI. DTTGTYTQYR 300 

TQDYPSVPTL VRIjLAKHNII PIPAVTNYSY SYYBKLHTYP PVSSLQVLQB DSSMIVELLB 360 

BAFNRIRSNL DIRALDSPRG LRTBVTSKMF QKTRTGSFHI RRGEV6IYQV QIiRALEHVDG 420 

7HVQQLPEDQ KCailHLKPSF SDGLKMDAGI XCDVCTCBLQ KBVRSARCSF N6DFVCGQCV 480 

C8BGWSO0TC NCSTGSIiSDI OPCLRBGEDK PCSQBaBGQC (mCVCYOBOR yBQQFCEYDH 540 

FQCPRTSGPL CNDR6RCSKG QCVCEPGWTG PSCDCPLSNA TCZDSN66IC NGR6HCXI06R 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQCQANGT GBKKGRTCEE CNPKVKMVDE 660 

LXRAEEVWR CSFRDEDDDC TVSYTMK5DG APOPKSTVLV HKKKDCPFGS FVmLIPLLLL 720 

LIiPLLAIiLIiL LCHKYCACCK ACLALLPCCH RGHMV6FKED HYMLRENIMA SDHLXTTPMLR 780 

SGNLKORDW RNKVTNNKQR PGFATBAASI NPTELVPYGL SltRLARLCTE NLIiKPDTREC 840 

AQLRQBVEEM LNEVYRQISG VHKLQQTKFR QQPNAGKRQD HTIVDTVLMA PRSAKPALItR 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVBFQEGVE LVDVRVPLPI RPEDDDEKQL 960 

LVBAIDVPAG TATLGRRLVN ITIIKBQARD WSPEQPEFS VSRGDQVARI PVIRRVU5GQ 1020 

KSQVSYRTQD GTAQGNRDYI PVBGELLFQP GBAWKEZX3VK LIiELQBVDSL LRGRQVRRFH 1080 

VQLSNPKFGA HZiGQPHSTTI IXRDPDELQR SFT8QMLSSQ PPPHGDLGAP QNSHAKAAGS 1140 

RXIRFNWIiPP S6KFMGYRVK YHIQGDSESE ABLIHSKVPS VELTHLYFYC DYEHKVCAYG 1200 

AQGEGPYSSL VSCRTHQEVP SEP6RLAFNV VSSTVTQLSW ABPAETOGEI TAYEVCYGLV 1260 

NDDNRPIGPM KKVLVDNPKN RMLLIEKLRE SQPYRYTVKA RNGAGWGPER EAIINIATQP 1320 

KRPMSZPIIP DIPIVDAQSO EDYDSFLMYS DDVIiRSFSGS QRPSVSDDTE HLVNGRMDFA 1380 

FPGSTHSLHR MTTTSAAAYO TBZiSPBVPHR VbSTSSTLTR DYNSLTRSEH SESTTLPROY 1440 

STLTSVSSHD SRLTAGVPDT PTRLVFSALO PTSLRVSWQE FRCBRPLQ6Y SVEYQLUIGG 1500 

BLHRLNIPNP AQTSWVEDL LPNHSYVPRV RAQSQEGWGR ERBGVITIES QVHPQSPLCP 1560 

LPGSAFTLST PSAPGPLVFT AIiSPDSLQLS WERPRRPNGD IVGYLVTCEM AQGGGPATAF 1620 

RVlXasSPESR LTVPGLSENV PYKFKVQART TE6FGPEREG IITIESQOGG PFFQLGSRAG 1680 

XiFQHPLQSEY BSXTTTBT8A TEPFLVD6PT LSAQBLBAGG BbTRBVTQEF V8RTLTT8GT 1740 
LSTHMDQQPP QT 

Seq ID NO: 131 DNA sequence 
Nucleic Acid Accession #: BC004372 
Coding sequence: 132..2231 

1 11 21 31 41 51 

1-1 I I I I 

CCTCQTGCCO 0GGACCCCA6 CCTCTGCCAG GTTCGGTCCG CCATCCTCGT CCCGTCCTCC 60 

GCG6GCCCCT GCCOCGCGCC CA6G6ATCCT CCAGCTCCTT TCGCCCXSCGC CCTCCGTTCG 120 

CTCCQGACRC CAT66ACAAG TTTTGCTGGC ACGCAGCCTG GGGACTCTGC CTCGTGCCX3C 180 

TGAGCCTGGC GCAGATOGAT TTGAATATAA CCTGCC6CTT TGCAGGT6TA TTCCAOGTGG 240 

AGAAAAATGG TCGCTACAGC ATCTCTC3GGA CGGAGGCCGC TGACCTCTGC AAGGCTTTCA 300 

ATAGCACCTT GCCCACAATG GCCCAGATG6 A6AAAGCTCT GAGCATGGGA TTTQA6ACCT 360 

GCACGTATG6 GTTCATAGAA G66CATGTG6 T6ATTCCCCG QATCCACCCC AACTCCATCF 420 

G1GCA6CAAA CAACACAG6G GTGTACATCC TCACATCCAA CACCTCCCAG TATGACACAT 480 

ATTGCTTCAA TGCTTCAGCT CCACCTGAAG AAQATTGTAC ATCAGTCACA GACCTQCCCA 540 

ATGCCTTTQA TGGACCAATT ACCATAACTA TTGTTAAOCG TGATGGCACC OGCTATGTCC 600 

AOAAAGGAGA ATACAGAAOQ AATCCTGAAG ACATCTACCC CAGCAACCCT ACTGATGATG 660 

ACGTGAGCAO OGGCTCCTCC AOTGAAAGQA GCAGCACTTC AGGAGGTTAC ATCTTTTACA 720 

CCTTTTCTAC TGTACACXTCC ATCCCAGAOG AAQACAGTCC CTGGATCACC GACAGCACA6 780 

ACAGAATCCC TGCTACCAQT A06TCTTCAA ATACCaTCTC AGCAGGCT6G GAGCCAAATG 840 

AA6AAAATGA AGATGAAAGA GACAGACACC TCASTITETC TGGATCAGGC AT7GATGATG 900 

ATGAAQATTT TATCTCCAGC ACCATTTCAA CCACACCAOG GGCTTTTOAC CACACAAAAC 960 

AGAACCAGGA CTGGACCCAG TGGAACCCAA GCCATTCAAA TCXXSGAAGTG CTACTTCAQA 1020 

CAACCACAAG GATGACTGAT GTAGACAGAA ATGGCACCAC TOCTTATGAA GGAAACTGGA 1080 

ACCXIAGAAGC ACACCCTCCC CTCATTCACC ATQAGCATCA TQAGGAAGAA GAOACCCCAC 1140 

ATTCTACAAG CACAATCCAG GCAACTCCTA GTAGTACAAC 6GAAGAAACA 6CCACCCA6A 1200 

AGQAACAGTO QTTTGGCAAC A6ATGGCAT6 AG6GATATCG GCAAACACCC AOAGAAGACT 1260 

CCaVTTOSAC AACAGGGAOV GCT6CAGCCT CA6CTCATAC CA6CCATCCA ATGCAAGGAA 1320 

GQACAACACC AAOCCCAGAG QACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380 

CCATQGGACQ AGGTCATCAA GCAGOAAGAA GGATGGATAT GGACTCCAGT CATAGTACAA 1440 

CGCTTCAGCC TACIGCAA^T QCMMCPtXAQ GTXTGGTGGA AGATTTGGAC AGGACAGGAC 1500 

CTCTTTCAAT GACAAGGCAO CAGASTAATT CTCAGAGCTT CTCTACATCA CATGAAG6CT 1560 

TG6AA6AAGA TAAAGACCAT CCAACAACTT CTACTCTGAC ATCAAGCAAT AGGAATGATG 1620 

TCACAGGTGG AAGAAGA6AC CCAAATCATT CTGAAGGCTC AACTACTTTA CTGGAAGGTT 1680 

ATACCTCTCA TTACCCACAC ACGAAGGAAA GCAGGACCTT CATCCCAGTO ACCrCAGCTA 1740 

AGACT6G6TC CTTTGGAGTT ACTGCAOTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800 

GTTCCTTATC AOGAGACCAA GACACATTCC ACCGCAGTG3 OGOOTOOCAT ACCACrCATG 1860 

6ATCTGAATC AGATOGACAC TCACATGGQA 6TCAAGAAGG TGGA6CAAAC ACAACCTCTG 1920 

GTCCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCC CrCTTGGCCT 1980 

TGGCTTTGAT TCTTGCAGTT TGCATTGCAG TCAACAGTC3G AAGAAGGT6T GGGCAGAAGA 2040 

AAAAGCTAGT GATCAACAGT GGGAATGGAG CTGTGGAGGA CAGAAAGCCA AGTGGACICA 2100 

AGGGAGAOGC CTOCAAGTCT CAGGAAATGO TGCATTTGGT GAACAAGGAO TCX3TCAGAAA 2160 

CTCCAGACCA QTTTRTt^CA OCTGATGAGA CAAGGAACCT GCACSiATGTG GACATGAAGA 2220 

TTQGGGTGTA ACACCTACAC CATTATCTTO 6AAAGAAACA ACCGTTGGAA ACATAACCAT 2280 

TACAQGGAGC TGGGACACTT AACAOATGCA ATGTGCTACT GATTGTTTCA TTGOQAATCT 2340 

TTTTTAGCAT AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 132 Protein sequence: 
Protein Accession #: AAH04372 

1 11 21 31 41 51 
| | | | | | 
MDKFWWHAAW GLCLVPLSIA QIDLNITCRP AGVFHVEKNG RYSISRTEAA DLCKAFNSTL 60 
PTMAQMEKAL SIGFETCRYG FIEGHWIPR IHPNSICAAN NTGVYILTSN TSQYDTYCFN 120 
ASAPPEEDCT SVTDLPKAFD GPITITIVNR DGTRYVQKGE YRTNPBDIYP SNPTDDDVSS 180 
GSSSERSSTS GGYIPYTPST VRPIPDEDSF WITDSroRIP ATSTSSNTIS AQHEPNEENE 240 
pSRDRBLSFS GSGXDDDEDF ISSTISTTPR AFOHTKQNQD WT0HNP8HSN PEVLLQTTTR 300 
MTDVDfiNGTT AYEOOHNPEA HPPLIHHEHH EEEETPHBTS TIQATPSSTT EBTATQKEQW 360 
FGNRWREGYR QTPBEDSHST TGTAAASAHT SHPMQGRTTP SPEDSSWTDF FNPISKFKGR 420 
GHQAGRRMm DSSHSTTIiQP TANPNTGLVE DLDRTGPIiSM TTQQSNSQSF STSHEGLESD 480 
KDHPTT6TLT 8SNBNDVTGG RROPNHSEGS TTItliEGYTSH YPHTKESRTF IPVTSAKTGS 540 
FGVTAVTVGD SNSNVNRSL8 GDQDTFHPSG GSHTTHGSBS DGHSEGSQE6 GANTTSGPIR 600 
TPQIPEWLII liASIiLAliALI IiAVCIAVNSR RRCGQKKKLV XNSGN6AVED RKPSGUIGEA 660 
SKSQHKVHLV NKESSBTFDQ PMTADETRNL QNVDMKIGV 

Seq ID NO: 133 DNA sequence 
Nucleic Acid Accession #: NM_002882 
Coding sequence: 150-755 

1 11 21 31 41 51 
| | | | | | 
CGAG6TTCGG GTOGXGGGGC GQAGGGAAGA 6GGGG0SG6C GGGABGCX3GC GGOGOCAGAC 60 
6C3GGAG6GAA GGAGCTAOGA GTA6C0GC06 AOAGGCOGCXS GAGCCAGCGA CGAC06ACCC 120 
AGCC6AGCCG C0GCC6CC6C CGOGCCCCCA TGGC6GC0GC CAAG6ACACT CATGAGGACC 180 
ATGATACTTC CACTGAGAAT ACAGAOGAGT CCAAOCATGA CCCTCAGTTT GAGCCAATAG 240 
[...] TGAGCAAGAA ATTAAAACAC TGGAAGAAGA TGAA6AGGAA CTTTTTAAAA 300 
TGCGGGCAAA ACTGTTCOQA TTTGCCTCTG AGAAOGATCr CCCAGAATGG AAGGA6CGAG 360 
GGACTGGTGA OGTCAAGCTC CTGAAGCACA AGGAGAAAGG GGGCATCCX3C CTCCTCAT6C 420 
G6AGGGACAA GACCCTGAA6 ATCTGTGCCA ACCACTACAT CACGCCGATG ATGGAGCTGA 480 
AGCCXAACGC AGGTAGOSAC OGTGCCTGGG TCTGGAACAC QUAGGCTGAC TTCGCX33ACG 540 
AGTGCCCCAA 6CCAGAGCTG CTGGCCATCC GCTTCCTGAA TGCTGAGAAT GCACAGAAAT 600 
TCAAAACAAA GTTTGAAGAA TGCAGGAAA6 AGATOGAAGA GAGAGAAAAG AAAGCAGGAT 660 
CAG6C3\AAAA TGATCATGCC 6AAAAAGTG6 CQ6AAAAGCT AGAAGCTCrC TCGGTGAAGG 720 
AGGAGACCAA GGAG6ATGCT GAGGAGAAGC AATAAATCGT CTTATTTTAT TTTCTTTTCC 780 
TCTCTTTCCT TTCCTTTTTT TAAAAAATTT TACCCTGCCC CTCTTTTT06 [...] 840 
ATTCTTTCAT TTTTACAAGG GAOSTTATAT AAAGAACTGA ACTC 

Seq ID NO: 134 Protein sequence: 
Protein Accession #: NP_002873 

1 11 21 31 41 51 
| | | | | | 
MAAAKDTHED HDTSTENTDE SNHDPQFEPI VSLPEQEIKT LEEDEEHLFK MSAKLFRFAS 60 
ENDLPEHKER GTGDVKIiLKH KEKGAIRLLH RRDKTLKICA NHYITPMMBL KPNAGSDRAN 120 
WfNTBADFAD ECPKPBLLAI RFLNAENAQK FRTKFBBCRK BZEBRBKKA6 SGKHDHAEKV 180 
AEKLBALSVK EETKEDAEBK Q 

Seq ID NO: 135 DNA sequence 
Nucleic Acid Accession #: NM_000077.2 
Coding sequence: 277-742 

1 11 21 31 41 51 
| | | | | | 
CCCAACCTGG GGGGACTTCA GGTGTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 
TCCTCOQAGC ACTG6CTCAC GGOQTCCXXT TGCCTGGAAA GATACCGCOQ TCCCTCCAGA 120 
GGATTTGA6G GACAGGGTOG GAG66GGCTC TTCCGCCAOC ACOGGAGGAA GAAAGA6GAG 180 
GGGCTGGCTG GTCACCAGAG GGTGGGG06G AC06CGTGGG CTCGGOGGCT 6CGGAGAGGG 240 
6GAQAGCAGG CAG06GGCGG CGGG6AGCAG CATGGA6CCG GCGGCGGGGA GCAGCATGGA 300 
GCCTTCGGCT GACTGGCTGG GCACGGCCGC 6GCCCGG6GT CGGGTAGAGG AGGT6CGGGC 360 
6CTGCTG6AG GCOGGGGOGC TGCCCAA06C ACCGAATAOT TAGGGTCX3GA G0C08ATCCA 420 
GGTCATGATG ATGGQCAGOG COCQAOTGGC GGA6CTGCT6 CTGCTCCA06 GOGGGGAGCC 480 
CAACTGCX3CC 6ACGCC6CCA CTCTCACCCG ACCGGTGCAC GACQCT6CCC GGOAGGGCTT 540 
CCTOGACACG CTGGTGGTGC TGCACCGGGC CGGGGOGCQG CTGQACJSTGC GCGATGCCTG 600 
GGGC0GTCT6 CCCGTGGACC TGGCTGAGGA QCTGGGCCAT CGCGATGTCQ CAOGGTACCT 660 
GOGOGCGGCT GCG66G66CA CCAOAGGCAQ TAACCATGCC CQCATAGATG CCGOGGAAGG 720 
TCCCTCAGAC ATCCCOQATT OAAAOAACCA GAOAGGCTCT GAOAAACCTC GGGAAACTTA 780 
GATCATCAGT CACCX3AAG6T CCTACAG6QC CACAACTGCC CC06CCACAA CCCACCC06C 840 
TTTCGTAGTT TTCATTTAGA AAATAOAGCT TTTAAAAATO TCCTGCCTTT TAACXSTAGAT 900 
ATATGCCTTC CCCCACTACC OTAAATGTCC ATTTATATCA TTTTTTATAT ATTCTTATAA 960 
AAATGTAAAA AA6AAAAACA G0GCTTCT6C CTTTTCACTG TGTTGGAGTT TTCTGGAOTG 1020 
J«5avCTCA0G aXTAAGCGC ACATTCATOT GGGCATTTCT TGC3QA6CCTC GCAGCCTCGG 1080 
GAAGCTGTOB ACTTOVrGAC AAGCATTTTG TGAACTA6G6 AAGCTCAGGG GGOTTACTGG 1140 
CTTCTCTTGA GTCAC3kCTGC TAGCAAATGG CAGAACCAAA GCTCAAATAA AAAXAAAATA 1200 
ATTTTCATTC ATTCACIC 

Seq ID NO: 136 Protein sequence: 
Protein Accession #: NP_000068.1 

1 11 21 31 41 51 
| | | | | | 
MEPAAGSSME PSADWLATAA ARGRVEEVRA LLEAGALPNA PNSYGRRPIQ VMMMGSARVA 60 
ELLLLHGAEP NCADPATLTR PVHDAAREGF LDTLVVLHRA GARLDVRDAW GRLPVDLAEE 120 
LGHRDVARYL RAAAGGTRGS NHARIDAAEG PSDIPD 



Seq ID NO: 137 DNA sequence 
Nucleic Acid Accession #: NM_058196.1 
Coding sequence: 104-421 

1 11 21 31 41 51 
| | | | | | 
TGT6TGGGGG TCTGCTTGGC GGTGAGGGGG CTCTACACAA GCTTCCTTTC C6TCAT6CC6 60 
GCCCCCACCC TGGCTCTGAC CATTCTOTTC TCTCTGGCAQ GTCATGATCA TGGGCAG06C 120 
C06AGTGG06 GAGCTGCTQC TGCTCCACGQ CGCGGAGCCC AACTGCGCOG ACCCJGCCaC 180 
TCTCAGCOGA CCCX3TGCA06 AOGCT6CCOG GGAGGGCTTC CTGGACACGC TGGTGOTGCT 240 
QCACCQG60C QGGG0GCX3GC TGGAOSTGCG C6ATGCCTGG GGCOGTCTGC CCGTGGACCT 300 
GGCTQAGGAG CTGGGCCATC G0GATGTC5GC ACGGTACCTG CGGGCGGCTG CXSGGGGGCAC 360 
CAGAGGCAGT AACCATGCCC GCATAGATGC CGCGGAAOGT CCCTCAGAGA TGOCGSATTG 420 
AAAGAACCAG AGAGGCTCTG AGAAACCTCG GGAAACTTA6 ATCATCA8TC ACOQAAGGTC 480 
CTAOVGGGCC ACAACTGCOC COGCCACAAC CCACOXGCT TTOGTAGTTT TCATTTAGAA 540 
AATAQAGCTT TTAAAAATGT CCTGCCTTTT AACGTAGATA TAAGCCTTCX: CTCACTACCG 600 
TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATGTAAAAA AIQAAAAACAC 660 
CGCTTCTGCC TTTTCACTQT GTTG6A0TTT TCTGGAQTGA 6CACTCAC6C CCTAA60GCA 720 
CATTCATGTG GGCATTTCTT GCGAGCCTCO CAQCCVCOGQ AASCTGTCQA CTTCATGACA 780 
AGCATTTTGT GAACTAGGOA AGCTCAG6G0 G6TTACTGGC TTCTCTTGAG TCACACTOCT 840 
AOCAAATGGC AGAACCAAAO CTCAAATAAA AKTAAAATAA TTTTCATTCA TTCACrC 

Seq ID NO: 138 Protein sequence: 
Protein Accession #: NP_478103.1 

1 11 21 31 41 51 
| | | | | | 
MMMGSARVAE LLLLHGAEPN CADPATLTRP VHDAAREGFL DTLVVLHRAG ARLDVRDAWG 60 
RLPVDLAEEL GHRDVARYLR AAAGGTRGSN HARIDAAEGP SDIPD 

Seq ID NO: 139 DNA sequence 
Nucleic Acid Accession #: NM_058197.1 
Coding sequence: 273-684 

1 11 21 31 41 51 
| | | | | | 
CCCAACCT6G G60GACTTCA GGTGT6CCAC ATTCGCTAAG TQCTOGGAOT OIAATAGCACC 60 
TCCTCCX5AGC ACTOGCTCaC G6CX3TCCCCT TGCCT66AAA GATACC!GC!G6 TCCCTCCAGA 120 
GGATTTGA6G GACAGGGTCX3 GAGGGG6CTC TTC0GCX3VGC ACCGGAGGAA GAAAGAGGAG 180 
GGGCTGGCTG GTCACCAGAG GGTQGGGCX3G ACOGCXTTQCX; CTCGGCGGCT GCGGAGAGGG 240 
GGAGAGCAG6 CAGCX3GGCG6 CGGGGAGCAO CATGGAGC06 GCGGCGGGGA GCAGCATGGA 300 
GCOGG06606 GOGAGCAGCA TGGAOOCTTC GGCTGACTG6 CT6GCCACGG CCGCXX3CCC6 360 
GGGrTCOGGTA GAGGA66T6C GG8C6CTGCT GQAGGOOQGG GCGCT6CCCA ACGCACCGAA 420 
TAGTTACGGT CGGAG6CCX3A tcx:aggtggg TAGAAGGTCT GCAGCGGGAG CAGGGGATGC 480 
C3GGGCGACTC TGGAGGACGA agtttgcaog GGAATTGGAA TCAGGTAGOG CTTCGATTCT 540 
CXX3GAAAAA6 QGGAGGCTTC CT6GGGAGTT TTCAGAAGGG GTTTGTAATC ACA6ACCTCC 600 
TOCTG6QSAC GCOCTGGGGG CTTGGGAAAC CAAGGAAGAO GAATGA6GAG CCACGCX306T 660 
ACAGATCTCT GGAATGCT6A GAAGATCTGA A666G6GAAC ATATTTGTAT TAGATGGAAG 720 
TCATGATGAT 6GGCAGGGCC CGAGTGGCGG AGCTGCTQCT GCTCCACGGC GCGGAGCCCA 780 
ACTGOSCXXSA CCCCX3CCACT CTCACCCGAC CCGTGCACGA 0GCTGCCCG6 QAGGGCTTCC 840 
TGGACACGCT GGTGGTGCTG CACCGGGCCG GGGOGOGGCT QGAOSTQCGC GATGCCTGGG 900 
GC06TCTGCC OGTGGACCTG GCT0AG6A6C TGQGCCATCG OGATQTOGCA CGGTACCTGC 960 
QOGOGGCTQC GGGGGGCACC AGAGGCAGTA ACCATGCCOG CATA6ATGCC GCGGAAGGTC 1020 
CCTCAGACAT CCCCGATTQA AAGAACXIAGA GAGGCTCTGA GAAACCTCGG GAACTTAGAT 1080 
CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCCCCC GCCACAACCC ACX:CCGCTTT 1140 
CGTAGTTTTC ATTTAGAAAA TAGAOCTTTT AAAAATGTCC TGCCTTTTAA OGTAGATATA 1200 
TGCCTTCCCC CACTACOGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1260 
TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTQAGC 1320 
ACTCACQCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GA6CCTC6CA 6CCTC0GGAA 1380 
GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAOaOAAG CTCA6GQ6GQ TTACTG6CTT 1440 
CTCTTGAOTC ACACTGCTAG CAAAT66CA0 AACCAAAGCT CAAATAAAAA TAAAATAATT 1500 
TTCATTCATT CACTC 



Seq ID NO: 140 Protein sequence: 
Protein Accession #: NP_478104.1 

1 11 21 31 41 51 

I I I I I I 

MBPAAOSSMB PAA6SSNBPS ADWLATAAAR GECVEEVRALL EAGALPKAPN SYGSRPZQVG 60 

RRSAAGflGDG ^HRTKFAG ELESOSASIL RKKGRIiFGEF SE6VCNHRPP PGDAL6AWET 120 

KBEE 

Seq ID NO: 141 DNA sequence 

Nucleic Acid Accession #: NM_058195.1 

Coding sequence: 163-684 

1 11 21 31 41 51 

) I 1 i ! 1 

CCTCCCTACG GGCGCCTCCQ GCAGCCCTTC CCGCGTGCGC AGGGCTCAGA GCC6TTCCQA 60 

GATCTTOQAG GTCC3GG0TGG GAGTSGG6GT GGGGTGGGGG TGGGGGTGAA GGTOGGGGGC 120 

QQGOGOGCrC AGGGAAGGGS GGTGOGOGCC TGGGGGGOSG AGATQGGCaQ GGG6CGGTGC 180 

6T0GGTCCCA GTCTGCAGTT AAGGGGGCAG GAGTG6CGCT GCTCACCTCT GGTOCCAAAG 240 

GGOGGCQCAG CQ6CTGCCX3A GCTCX3GCCCT GGAGGCGGCG AGAACATGGT GCGCAGGTTC 300 

TTGGTGACCC TCCGGATTCG GCG0GCX3TGC GGCXTCGCCJGC GAGTGAGGGT TTTCX3TG6TT 360 

CACATCCC6C GGCTCACGGG GGAGT6GGCA GC6CXAGG6G CGCCCGCCGC TGTGGCCCTC 420 

GTQCTGATGC TACTGAGGAG CCAOOGTCTA GG6CAGCA6C CX3CTTCCTAG AAGACX»GGT 480 

CATGATGATG GGCAGCGCCC 6AGTG6C3G0A 6CTQCTGCT0 CTCCA06GC3Q OGGAGCCCAA 540 

CTGGGCC6AC CCCGCXaCTC TCACCCGACC CGTQCAC6AC GCTGCCCGGG AGGGCTTCCT 600 

GGACACGCTG GTGGTGCTGC ACCGGGCCGG GGCX3CGGCTG QACJGTGCGOG ATGCCT6GGG 660 

CCGTCTGCCC GTGGACCT6G CTGAGGAGCT GGGCCATCX3C GATGTOGCAC GGTACCTGQG 720 

CGCGGCTG06 GGGGGCACCA OAGGCAGTAA CCATGOCCiGC ATAGATGC06 OSGAAGGTOC 780 

CTCAGACATC CCC6ATTGAA AGAACCRGAG AGQCTCrOAO AAA0CTC3GGG AAACTTAQAT 840 

CATCAGTCAC CGAA66TCCT ACAGGOCCAC AACTGCCCCX: GCCACAACCC ACCCCGCTTT 900 

CGTAGTTTTC ATTTAGAAAA TAOAGCTTTT AAAAATGTCXr TGCCTTTTAA CGTAGATATA 960 

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1020 

TQTAAAAAAG AAAAACACCX3 CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1080 

ACTGACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTOGCA GCCTCXJaaAA 1140 

GCTGTGGACT TCATGACAAG CATTTTGT6A ACTAGGGAAS CTCAGGGGGG TTACTGGCTT 1200 

CTCTTGMSTC i^ChCTGCTAG CAAATG6CAG AACCAAAflCT CSUUVTAAAAA TAAAATAATT 1260 
TTCATTCATT CACTC 

Seq ID NO: 142 Protein sequence: 
Protein Accession #: NP_478102.1 



1 11 21 31 41 51 

I 1 I i I I 

MGRGRCV6PS LQLRGQEWRC SPLVPKGGAA AAELGPGGGE NMVRRFLVTIi RIRRACGPPR 60 
VRVFWHIPR LTGENAAPGA PAAVALVLHL LR8QRLGQQP LPRRFGHDDO QRPSGGAAAA 120 
PRRGAQLRRP RHSRPTRARR CFGGLPGSA6 GAAPGRGAAG fiARCLGPeAR GPG 

Seq ID NO: 143 DNA sequence 
Nucleic Acid Accession #: NM_016131 
Coding sequence: 412..1107 



1 11 21 31 41 51 

i I 1 I t I 

GAAATTGCAC ACTTAAAGAC ATCAGTGGAT GAAATCACAA 6TG0GAAAGG AAAGCTGACT 60 

GATAAAGAOA GACAGAGACT TTTGGAGAAA ATTCGAGTCC TTGAGGCTGA QAAGGAGAAG 120 

AAT6CTTATC AACTCACAGA GAAGQACAAA GAAATACAGC 6ACTGAGAGA CCAACTGAA6 180 

GCSCAQATATA GTACTACCGC ATTGCTTGAA CAGCTQGAAG AQACAAOGAG A6AAGGAQAA 240 

AGOAGGGAGC AGGTGTTGAA AGCCTTATCT GAAGAGAAAG ACGTATTGAA ACAACAGTTG 300 

TCTGCTGCAA CCTCACGAAT TGCTGAACTT GAAAGCAAAA CCAATACACT COGTTTATCA 360 

CAGACTGTGG CTCCAAACTG CTTCAACTCA TCAATAAATA ATATTCATGA AATGQAAATA 420 

CAGCTGAAAG ATGC TCTGGA 6AAAAATCAG CAGTGGCTCG T6TATGATCA GCAGCGGGAA 480 

GTCTATGTAA AAGGACTTTT A0CAAAGATC TTTGAfiTTGG AAAAGAAAAC GGAAACAGCT 540 

6CTCATTCAC TCCCACAOCA GACAAAAAAG CCTGAATCAG AAGGTTATCT TCAAGAAGAG 600 

AAGCAGAAAT GTTACAAOQA TCTCTTGGCA AGTGCAAAAA AAGATCTTGA GGTTGAACGA 660 

CAAACCATAA CTCAGCTGAG TTTTGAACTG AGTGAATTTC GAAGAAAATA TGAAGAAACC 720 

CAAAAAGAAG TTCACAATTT AAATCA6CTG TTGTATTCAC AAAGAA6G6C AGATGTGCAA 780 

CATCTGGAAG AXGATAGGCA TAAAAGAGAG AAQATACAAA AACTCAGGGA ASAGAATOAT 840 

ATTGCTAGGG GAAAACTTGA AGAAGAGAAG AAGAGATCC6 AAGAGCTCTT ATCTCAGGTC 900 

CAGTCTCTTT ACACATCTCT GCTAAAGCAG CAAGAAGAAC AAACAAGQGT AGCTCTGTTG 960 

GAACAACAGA TGCAGGCATG TACTTTAGAC TTT6AAAAT6 AAAAACTCGA CC6TCAACAT 1020 

GTGCAGCATC AATTGCATGT AATTCTTAAG QAGCT0C6AA AAGCAAGAAA AAATAACACA 1080 

GTTGGAATCC TTGAAACAGC TTCATGAGTT TGCCATCACA (lAQCCATTAO TCACTTTCCA 1140 

AGGAGAGACT GAAAACAGAG AAAAAGTTGC CGCCTCAOCA AAAA6TCCCA CTGCTGCACT 1200 

CAATGGAA6C CTGGTGGAAT GTCCCAAGTG CAATATACAG TATCCAGCCA CTOAGCATCG 1260 

OQATCTGCTT GTCGATQTGG AATACTGTTC AAAGTAGCAA AATAAGTATT TGTTTTGATA 1320 

TTAAAAGATT CAATACTGTA TTTTCTGTTA GCTTGTGGGC ATTTTGAATT ATATATTTCA 1380 

CATTT TGCAT AAAACTGCCT ATCTACCTTT QACACTCCAG CAT6CTAGTG AATCATOTAT 1440 

CTTTTAGGCT GCTGTQCATT TCTCTTGGCA 6TGATACCTC CCTG»CATGG TTCATCATCA 1500 

GGCTGCAATG ACAGAATGTG GTGAGCAGOG TCTACTQAGA TACTAACATT TT6CACTGTC 1560 

AAAATACTTG GTGAGGAAAA GATAGCTCAG GTTATTGCTA ATGGGTTAAT GCACCAGCAA 1620 

GCAAAATATT TTATGTTTCG 6GGQTTTTGA AAAATCAAAG ATAATTAACC AAGGATCTTA 1680 

ACTGTGTTOG CATTTTTTAT CCAA6CACTT AGAAAACCTA CAATCCTAAT TTTGATGTCC 1740 

ATTGTTAAGA GGTGGTGATA GATACTATTT TTTTTTCATA TTGTATAGCO GTTATTAGAA 1800 



GAAACTTAAC CCAQCTGT6T AAGTTGGGGA TTTTCTTGAT CTTTATTGCT QCTTACCATT 1860 

GQCATAATCT TAAGTGGCCA TCCCCAACTC TGTTCTGCGC ACGAAACAOT ATCTGTTTGA 1920 

ACTT6AATTA CATTAGOVCA CACACAATGT TTTCTCTTAT QTTATCTGQC AGTAACTGTA 1980 

CCATGTAGCC CTCTCATTTG TTCTGCTTAQ CTAAAATTOT TAAAATAAAC TTTAATAAAC 2040 

GGCAATOTAA TQATCAGATC ATTOACAGTA TTTTAfiTTAT TTTTG6CATT CTTAAAGCTG 2100 

AAACCftAAMl CTTTTAAATT TTTOTTTGTC TGAACAGGTA TTTTTATACA TOCTTTTTGT 2160 

AATQAGAAIA (3AA.TIAAATT TCTTCAQQTT TTCTAACATG CTTACCACIG GQCTACTGTA 2220 

ATTTAATGTT TT 



Seq ID NO: 144 Protein sequence: 
Protein Accession #: NP_060601 

1 11 21 31 41 51 

I I I I 1 I 

MSIQliKDAIiE KNOQWIiVYI^ QREVYVKGLlj AKIFELEKKT ETAABSLPQQ TKKPESESYL 
QEBKQKCYND LXASAKKDLE VSRQTITQLS FELSEFRRKY EETQKEVHNL KQLLYSQRRA 
OVQHIiEDDRH KTEKIQKLRE BNDIARGKLE EEKKRSEELL SQVQSLYTSL LKQQSBQTRV 
ALIfQQMQAC TIiDFBlIEKLD RQBVQHQLHV ILXEIiRKARK NNTVGIIiBTA 8 

Seq ID NO: 145 DNA sequence 
Nucleic Acid Accession #: NM_001168 
Coding sequence: 50..478 

1 11 21 31 41 51 
| | | | | | 
CCXXXaVGATT T6AATCGCG6 QACCOGTTGG CAGAGGTGGC GQOGGCGGCA TGGGTGCCCC 60 
GAOGTTGCCC CCTOCCTGGC AGCCCTTTCT CAAGGACCAC CGCATCTCTA CATTCAAGAA 120 
CTGOCCCTTC TT6QA6QQCT QCGCCTGOVC CCGGGAGCGG ATGGC06AGG CTGGCTTCAT 180 
CCACTGCCCC ACTGAGAACG AGCCAGACTT GGCCCAGTGT TTCTTCTGCT TCAAGGAGCT 240 
G6AAGGCTGG QAGCCAGATG ACGACCCCAT AGAGGAACAT AAAAAGCATT CGTCOGGTTG 300 
CGCTTTCCTT TCTQTCAAOA AGCAGTTTGA AGAATTAACC CTTGGTGAAT TTTTGAAACT 360 
GOACAGAOAA ASAGCCAAGA ACAAAATTGC AAAGGAAACC AACAATAA6A AGAAAGAATT 420 
TGAG6AAACT GCGAAGAAAG TGCGCCGTGC CATCGAGCAG CTGGCTGCCA TGGATTGAQG 480 
CCTCTGGCCG GAGCTGCCTG GTCCCAGAGT GGCTGCACCA CTTCCAGGGT TTATTCCCTO 540 
GTGCCACCAG CCTTCCTGTG GGCCCCTTAG CAATGTCTTA GQAAAGGAGA TCAACATTTT 600 
CAAATTAGAT GTTTCAACTG TGCTCCTGTT TTGTCTTGAA AGTG6CAGCA 6AGGT6CTTC 660 
TGCCTGTGCA GCGGGTGCTG CTGGTAACAG TGGCTGCTTC TCTCTCTCTC TCTCTTTTTT 720 
GGGGGCTCAT TTTTGCTOTT TTGATTCCCG GGCTTACCAQ GTGAGAAGTG AGGGAGGAAG 780 
AAGGCAGTGT CCCTTTTGCT AGA6CTGACA GCTTTOTTOG G6TGGGCAGA 6CCTTCCAGA 840 
GTGAATQTQT CTGGACCTCA TGTTGTTGAG GCTGTCACAG TGCTGAGTGT GGACTTGGGA 900 
GC3TGCCT0TT GAATCTGAGC TGCAQGTTCC TTATCTGTCA CACCTGTGCC TCCICAGAGG 960 
ACAGTTTTTT TQTTOTTGTQ TTTTTTTGTT TTTTTTTTTT GGTAGATGCA TGACTTGTGT 1020 
G1X3ATGA6AG AATGGAGACA GAGTCCCTGG CTCCTCTACT GTTTAACAAC ATGGCTTTCT 1080 
TATTTTGTTT GAATTGTTAA TTCACAOAAT A6CACAAACT ACAATTAAAA CTAAGCACAA 1140 
A6CCATTCTA AOTCATTGGG 6AAA0GQGGT GAACTTCAOG TGGAXGAOGA GACAOAATAO 1200 
AGTGATAGGA AG06TCTGGC AGATACTCCr TTTGCCACTG CT6TGIGATT AGACAGGCCX: 1260 
AGTOAGCCGC GGGGCACATG CTGGCCGCTC CTCCCTCAGA AAAAG6CAGT GGCCTAAATC 1320 
CTTTTTAAAT GACTTGCCTC GATGCTGTGG GGGACTGGCT GGGCTGCTGC AQGCCGTGTG 1380 
TCTGTCAGCC CAACCTTCAC ATCT6TCA0G TTCTCCACAC GGGGGAQAGA CGCAGTCC6C 1440 
CCAOQTCCCC OCTTTCTTTG GAGGOftOCAO CrCCCGCAGO OCTQAAGTCT GOOSTAAOAT 1500 
GATGGATTTG ATTOGCCCTC CTOCCTfflCA TAGAGCTGCA GQOTQGATTG TTACAGCTTC 1560 
GCTGGAAACC TCTGGAGGTC ATCTOGGCTG TTCCTGftGAA AZAAAAA6CC TGTCATTTC 



Seq ID NO: 146 Protein sequence:
Protein Accession #: NP_001159

1 11 21 31 41 51

I I I I I I

MGAPTLPPAW QPFLKDHRIS TFKNWPFLEG CACTPERMAE AGFIHCPTEN EPDLAQCFFC
FKELEGWEPD DDPIEEHKKH SSGCAFLSVK KQFEELTLGE FLKLDRERAK NKIAKETNNK
KKEFEETAKK VRRAIEQLAA MD

Seq ID NO: 147 DNA sequence
Nucleic Acid Accession #: NM_014176.1
Coding sequence: 127-720



1 

I 

GCQ06CM30G 
AGT6CATCCC 
GGGATCATGC 
CCCCCAGGCA 
TTAG6T6GAG 
GAGAQ6TACC 
ATTGATTCTG 
AGACCATCCC 
AACCCTGATG 
TTCCTCAAGA 
GAGGAAGAGA 
CAGAAAAGGA 
G6GACTTGTC 
ACCTTGAATT. 
GTACATATGT 



11 
I 

CTGGTACCCC 
AGGCAGCTCT 
AGAGAGCrrC 
TCACATGTTG 
CCAACACACC 
CATTTGAACC 
CTGGAAGGAT 
TCAACATOGC 
ACCCGCTCAT 
ATGCCAGACA 
TGCTTQATAA 
AGGCCAGTCA 
CTGGTTCATC 
TTTTTTTAAA 
ATTTTGAAAT 



21 
I 

GTT6GTCC6C 
TA6TGTGGAG 
ACGTCTGAAO 
GCAA6ATAAA 
TTATGAGAAA 
TCCTCAGATC 
TTGTCTGQAT 
AACTGTQTTG 
GGCTGACATA 
GTGGACAGAG 
TCTACCAGAG 
GCTAGTAGGC 
TTAGTTAATG 
TATATTTGAT 
CTTTTAAACC 



31 
I 

GCQTTGCTGC 
CAGTGAACTG 
AGAGAGCTGC 
GACCAAATGG 
GGT6TTTTTA 
CGATTTCTCA 
GTTCTCAAAT 
AOCTCTATTC 
TCCTCAGAAT 
AAGCAT6CAA 
GCTGGTGACT 
ATA6AAAAOA 
TGTTCTTTGC 
GACATAATTT 
TGAAAAATAA 




ACATGTTAGC 
ATGACCTGCG 
AGCTAGAAGT 
CTCCAATTTA 
TGCCACCAAA 
AGCTGCTCAT 
TTAAATATAA 
6ACAGAAACA 
CCAGAGTACA 
AATTTCATCC 
CAAGGTGATC 
TTGTQTAGTT 
ATAGTCATTT 



51

I 

GTQTCAGCTC 
CTTCTACTTG 
CACAGAGCCA 
AGCTCAAATA 
TATCATTCCT 
TCATCCAAAC 
AGGTGCTTGG 
GTCAGAACCC 
TAA6CCAGCC 
AAAGGCTGAT 
CAACTCAACA 
TGATGTTTAG 
TAAGTTGCCT 
TATTTATCTT 
AATQT3X3AAA 







AAAAAAAAAA AAAAAAAAAA AAAAAAAA 

Seq ID NO: 148 Protein sequence:
Protein Accession #: NP_054895.1

1 11 21 31 41 51 

I I I I I I 

MQRASRLKRE LHMLATEPPF GITCWQDKDQ MDDLRAQILG GANTFYEKGV FKLEVIIPER 
YPFEPPQIRF LTPI'VBPNID SAORICLDVL KLPPKGKWRP SLNIATVI.TS IQLLMSEPilP 
DDPZiNADISS BFKniKPAFL KKARQWTBKH ARQKQECADBB BKLDNIiPEAG OSRVHNSTQK 
RKASQLVGIB KKFHPDV 

Seq ID NO: 149 DNA sequence 
Nucleic Acid Accession #: NM_003812
Coding sequence: 224-2722






TCCTCTGQ6T 
GCdCAGCtXIC 
CCATGC606C 
GCTCTCGCCX5 
CAGCTCGGGG 
AGGCGGCCCC 
GCTTCTCGTC 
GGCTGCTGCG 
66CAGAT6AA 
AATGCAGAAA 
AAGCCCTTAT 
CCATCTGGCC 
CATACTGAAC 
ACCACAGTAC 
AGACTCCAAQ 
CTTCGTGTAT 
ACATATAATC 
GOAAAGAGGT 
AGCASTSAAT 
TAATGATCAC 
AAAGTCCGTG 
CCT6GTGGCT 
GCAGATGCTC 
GCACCTCATC 
TGTCTCTTCT 
ACAAC3TATTA 
AAAGCCAAAA 
GTCCCATTCT 
AGQAGGTGQA 
AAAT66ATAC 
ATTATGCTGT 
TAACAATACC 
GTGTGATATT 
GCAAGAOGGA 
CAGAGACAAC 
CTATGAAAA6 
GTGGATTCAG 
TCX3AGCTCCA 
AG6CCX3GGT6 
CTATGTAGAA 
ACAAATTCAA 
GGGCCATGGG 
AGATTGCAGT 
GG6TCCTA6T 
TATTGTCCTT 
TACTCAGCAA 
ATTCTGGGTA 
CTTTGGGTGG 
CTGTCTCTTT 
GQGQQCAAAA 
ACGAAGGAAC 



11 

I 

CXXGCCCCGG 
GAGCCCOGGQ 
06AGC06G08 
GCCACACGGA 
CAGCCGCCCC 
GCGGGCTCGG 
CTTCTCCTGC 
CCCAGCGCTC 
6ACAATACAT 
GAAATCACAC 
CACGTTCTTG 
CAOOCAAOCT 
AAT6GTTTGT 
TCTAAGGGTO 
GTGGCTCTGT 
ATGATAGAGC 
CAGAAAACCT 
GACCAGT6GC 
CCATCAOGTG 
AAAACGTATA 
GTCAACCTTG 
GTA6AGACCT 
CATGAGTTCT 
TOGGGGGTGA 
CGCACAAGAG 
TG6CAGA6CC 
TQTQACTGCA 
CQAAAATTTT 
GCCTGCCTTT 
GTG6AAGCIG 
AAGAAAT6TT 
TCATGTCTTT 
ACTGAATATT 
TATGCATGCA 
CAGTGTCAGT 
CTGAATACAG 
TGCAGCAAAC 
CGTATTGGTC 
ATTGACTGCA 
GA!r6GAA0QC 
GCCCTAAATA 
GT6TGTAGTA 
ATCGOSGATC 
GCCACCAATC 
GGGGGCAOU3 
GGCCCCATCT 
TGACATACTC 
TAATGACTAC 
TG6AAATAAT 
GACCATQCTA 
AACACACACA 



21 
I 

GAGT6GCTGC 
CCCCGTGCCC 
TGAGC66CTC 
GCX3GC6CCC6 
TGGOGGGCTG 
TGCCTGCCA6 
TOCCTCCGCT 
CGCATTGGAA 
TGCAACAGAA 
TGCCTTCAAG 
ACACAAAG6C 
TCCAGATTOA 
TGTCTTCTOA 
GAGAGCACTG 
CAACCT6CAA 
GACIAGA6CT 
T GGOkG GACA 
CCTTTCTCTC 
GTATATTTQA 
AGAAGCATCG 
TGGATTCTAT 
GGACTGAGAA 
CAAAATACCG 
CATTTCACTA 
6AGTTGGTGT 
T6GCTCAAAA 
CAGAATCCTG 
CAAAGTGCAG 
TCAACAOGCC 
6GGAGGA6IG 
CCCTCTCCAA 
TTCAGCCACG 
GTACTGGAfiA 
ATCAAAATCA 
ACATCT66G6 
AAG6CACTGA 
ATGATGTGTT 
AACTTCAGGG 
GT6GTG0CCA 
CAT8TG60CC 
T6AGCAGCTG 
ATGAAGCCAC 
CAGTTAGGAA 
TCATAATAGG 
GCTGGGGATT 
GAATCA6CTG 
GCAGCAGTGT 
GGAGCTAAAG 
GTCAAAGAAC 
TAAAAAGAAC 
CAAAAATTAA 



31 
I 

GAGGCTAGGC 
GQAGCOCGQA 
CGC0CGCG6C 
GGAGCTATGA 
CAGCCTTGCC 
CGCCCCX3GCC 
OGCOSCXTrOS 
TQAAACTGCA 
TA6CAGCAGT 
ACTCATATAT 
AAGACACCAG 
AGCCTTOOGC 
TTATGT6GAG 
TTACTACCAT 
TGGACTTCAT 
G6TTCATGAT 
GTATTCTAAG 
TGAATTACAG 
AGAAAT6AAA 
CTCTTCTCAT 
TTACAAGGAO 
GGATCAGATT 
QCAGCGCATT 
TAAGAGAAGC 
GAATGAGTAT 
CCTTGGAATC 
GGGTG6CTGC 
CATTTTGGAG 
AACAAAGCIA 
TGATTGTGGT 
OGGGGCTCAC 
AGGGTATGAA 
CTCrGGTCAG 
GGGC06CTQC 
AACAAAGOCT 
GAAGGOAAAC 
CTGTGGATTC 
TGAGATCATT 
TGTAGTTTTA 
GTGTATGATQ 
TCCACTOGAT 
CTGCATTTGT 
CCTTCACGCC 
CTCCATCGCT 
TAAAAATGTC 
08CTGGATGQ 
TACTGGAACT 
TT6GGGTGAC 
ACCTTTCACC 
TGTTCCAGAA 
ATGCAATAAA 



41 

I 

GAGCCGGGAA 
GCCCCCTGOC 
OSCCCOGCAG 
6CCATGAAGC 
GGCGCTTCCT 
GGCACGCCGC 
TCCOSGCCCC 
GAAAAAAATT 
AATATCAGTT 
TACATCAACC 
CAAAAACATA 
TCCAAATTCA 
ATTCACTAOG 
6GAAGCATCA 
GGCATGTTTG 
GA6AAAAGCA 
CAAAT6AAGA 
TG6TTGAAAA 
TATTTGGAAC 
GCACATACCA 
CAGCTCAACA 
QACATCACCA 
AAGCAGCATG 
A6TCTGA6TT 
GGTCTTCCAA 
CAATGGGAAC 
ATCATGGAGG 
TATAGAGACT 
TTTGA6CCCA 
TTTCATGTOO 
TGCAGC6AC6 
TGCOSGGATG 
TGCCCACCAA 
TACAATG60G 
GCAGGGTCTG 
T60SG6AAGG 
TTACTCTSTA 
CCAACTTCCT 
GATGATGATA 
TGTTTAGATC 
TCCAAGGGTA 
GATTTCACCT 
CCCAAGGATG 
GGTGCCATCX: 
AAC»AGA6AA 
ACAGGGCCTT 
ATTAAGTTTG 
AAGGATGGGG 
ACCTGTCAGT 
TCTTTrTTTT 
GGAATCATTA 



AG6GGG0GCC 



CTAGCCC3GGC 
CGCCCGGCA6 
GCG6CCCCCA 
CCTGCCGCCT 
GOOCCTGGGG 
TGGGAGTCCT 
ACA6CAATGC 
AAGACTOGGA 
ATAAGGCTGT 
TTCTTGACCT 
AAAAT6GGAA 
GAG6CGTCAA 
AAGATGATAC 
CAGGTCX3ACC 
ATCTCACTAT 
GAAGGAAGA6 
TTAT6ATTGT 
ACAACTTTGC 
CCAGGGTTGT 
CCAACCCTGT 
CTGATGCTGT 
ACTTTGGAG6 
TGGCAGTGGC 
CTTCTAGCAG 
AAACAGGGGT 
TTTTACAGA6 
OGGAATGTGG 
AATGCTAT6G 
GGCOCTGCTG 
CTGTGAAGGA 
ATCTTCATAA 
AGTGCAAGAC 
ACAAGTTCTG 
ATGGAGAC06 
CCAATCTTAC 
TCTACCATCA 
OGGATGTGGG 
GGAA GTGCC T 
AAGTGTGTTC 
GGGCAGGGAC 
AAGGACCCAA 
TGGTAGCA6C 
GGTTCXSITCC 
6CACTGTTGG 
TAAACAAAAC 
TAAAAGAAAA 
AAAG666GGA 
7CCCTAATGG 



Seq ID NO: 150 Protein sequence:
Protein Accession #: NP_003803



MKPPGSSSRQ 
RPRAWGAAAP 
INQDSESPYH 
RYENGKPQYS 
KSTGRPHZIQ 
LESjMIVIIDHK 
ITTNPVQMLH 
LPNAVAQVIjS 
RDFLQRGGGA 
SD6PCCNNTS 
NGBCKTRDHQ 



11 
1 

PPLAGCSLAG 
8APHWNETAE 
VLDTKARHQQ 
KGGEHCYYHG 
KTLAGQYSKQ 



21 
I 

ASC?GPQRGPA 
KNIiGVLADED 
KHNKAVHLAQ 
SIR6VKDSKV 



EFSKYRQRIK 
QSLAQNLGXQ 
CLFNRPTKLF 
CLFQFRGYEC 
CQyZWGTKAA 



HTNNFAKSW 
QHADAVHLIS 
W£PSSRKPKC 
EPTECGNGYV 
BDAVNBCDIT 
GSOKFCYEKL 



31 

I 

QSVPASAPAR 
NTLQgNSSSK 
ASFQIEAFGS 
ALSTCNGLHG 
QWPFLSELQW 
NLVDSIYKBg 
RVTFHYICRSS 
DCTESWGGCI 
EA6EECDC6F 
KYCTGDSGQC 
MTBOTfiKGNC 



41 
I 

TPPCRLLLVL 
ISYSNAMQKE 
KFILDLILNH 
MFEDDTFVYH 
LKRRKRAVNP 
LNTRWLVAV 
LSYFGGVCSR 
MEBTGVSHSR 
HVBCYGLCCK 
PPNIiKKQDGY 
GKIXaiRWIQC 



51 

1 

LLLPPLAAS8 
ITLPSRLIYY 
GLLSSDYVEI 
ISPLEbVHDE 
SR6IFBENKY 
BTWTEKDQID 
TRGVGVNEYG 
KPSKCSILEY 
KCSLSNGAHC 
ACNQNQGRCY 
SECHDVFOQPL 






IiCTMLTRAPR IGQLQGBHP TBFYHQGRVl OC8GARWLD DDTDWSYVED QTPGGFSMHC 
LDRKCLQIOA U3MSSCFLDS K6RVC86HSV CSNEATCZCD FTKAGTDCSZ RDPVHNXaPP 
KDB6PKGPSA TNLZIGSIAG AZLVAAIVU3 GTQWGFKNVK KRRFDPTQQ6 PI 






Seq ID NO: 151 DNA sequence 
Nucleic Acid Accession #: NM_023915
Coding sequence: 250-1326

1 11 21 31 41 51 

6GCACX5AGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTQCC QACCTTAeTT 60 
TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACCGTATGAG 120 
GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180 
CCCACGCCTC AATOGTOCCC AAGTGTTTCX: TGACAC6CAT CTTTGCTTAC AGTGCATCAC 240 
AACTGAAGAA TGGG6TTCAA CTTGAOGCTT GCRAAATTAC CAAATAACGA GCT GCAOSG C 300 
CAAGAGAGTC ACAATTCAGG CAACAGGAOC GACGGGCCAG GAAAQAACAC CACCCTTCAC 360 
AATGAATTTG ACACAATTGT CTTGCCGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420 
TTGCTGAATG GTTTAGCAGT GTGOATCTTC TTCCACATrA GGAATAAAAC CAGCTTCATA 480
TTCTATCTCA AAAACATAGT GGTTOCAGAC CTCATAATGA CX3CTGACATT TCCATTTCX5A 540 
ATA0TCCAT6 ATGCftGGATT TGGACCTTGO TACTTCAAGT TTATTCTCT6 CAGATACACT 600 
TCRGTTTPGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT OATAAGCATT 660 
GATCGCTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACA6 CS^TAACCTTC 720 
AOGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780 
ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 
CCTTT0GGG6 TCAAATGGCA TACGGCAGTC AOCTATGTGA ACAGCTGCTT GTTTGTGGCC 900 
GTGCTGGTGA TTCTCATCGG ATGTTACATA GCCATATCCA GGTAC»TCCA CAAATCCAGC 960 

AGGCAATTCA TAAGTCAGTC AAGCCX3AAAG CGAAAACATA ACCAGA6CAT CAGGGTTGTT 1020 

GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATGACTTGT GCAGAATTCC TTTTACTTTT 1080 

A6TCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAA AGAA 1140 

ATTACRCTTT TCTTGTCTOC GTGTAATOTT TGCCTGGATC CAATAATTTA CTTTTTCATG 1200 

TGTAGGTCAT TTTCftAGAAG GCTGTTCftAA AAATCAAATA TCAGAACCAG GAGTGAAAGC 1260 

ATCAGATC3VC TGCAAAGTGT GAGAAQATCG GftfiOTTCGCA TATATTATGA TTACACT6AT 1320 

GTGTAGGCXrr TTTATTGTTT QTTGGAATOQ ATATGTACAA AGTGTAAATA AATGTTTCTT 1380 
TTCATTATCC TTAAAAAAAA AA 



Seq ID NO: 152 Protein sequence:
Protein Accession #: NP_076404

1 11 21 31 41 51

MGFNLTLAKL PNNELHGQES HNSGNRSD6P GKNTTLHMBF DTIVLPVLYIi IIFVASIIiIiN 60 

GLAVWIFFHI RNKTSFIFYL KNIWADblM TLTFPPRIVH DAQFGPHYFK FILCRYTSVIi 120 

FYANMYTSIV FLGLISIDRY LKWKPPGDS RMYS1TPT1C7 LSVCVWVIMA Vl*SLPNIIItT 180 

HGQPTBDNIH DCSKtKSPLG VKMHTAVTYV NSCLFVAVLV IIiIGCVIAIS RYIKKSSROP 240 

I6QSSRKRKB NQSIRVWAV FFTCPIiPYHL CRIPFTFSHL DRLLDESAQK IIiYYCaCEITL 300 
PLSACMVCLD PIIYPFHCRS PSRRLFKKSW IRTRSESIRS I^SVRRSBVR IVYDYTDV 

Seq ID NO: 153 DNA sequence
Nucleic Acid Accession #: D80008.1
Coding sequence: 149-739

1 11 21 31 41 51 

GTTCGGCGCC AAA6CGCGGA GOGGAGGCOG AGG06AGAGC CTGGCGCTGT AGGACTAGAA 60 

CGAAAG6AGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CCJTGAGAGCT GGTGGTTGGC 120 

AAGGCOGOGG GAGTQGGAAG CGTCCGCCAT GTTCTG06AA AAAGCCATGG AACTGATCOG 180 

QSAGCIQGAT GS08GQ0CCG AAGGGCAACT GCCTGCCTTC AACGAQGATO GACTCAQACA 240 

AGTTCTGGA6 GAGATGAAAG CTTTGTATQA ACAAAAOCAG TCTQATGTGA ATGAAQCAAA 300 

GTCAGGTGQA CGAAGTGATT TGATACCAAC TATCAAATTT CGACACTGTT CTCTGTTAAQ 360 

AAATCGAOQC TGCACTGTAG CATACCTGTA TGACOGCTTG CTTCGGATCA GAGCACTCAG 420 

ATG6GAATAT GGTAGOGTCT TGCCAAATGC ATTAC6ATTT CACATG6CTG CTGAAGAAAT 480 

GOMTGOTTT AATAATTATA AAAOATCTCT TGCTACXTAT ATGAGOtCAC TQGQAGGAGA 540 

TOAAGGTTTG GACATTACAC AQGATATGAA ACCACCAAAA AGOCTATATA TTGAAGTCCG 600 

GTGTCTAAAA GACTATGGAG AATTTGAAGT TGATGATGGC ACTTCAGTCC TATTAAAAAA 660 

AAATAGCCAG CACTTTrTAC CTCGATGQAA ATGTGAGCAG CTGATCAGAC AAQ6AGTCCT 720 

GGAGCACATC CTGTCATGAC CATGCGCOQA G6C3WOTCCA GG CTTC ACTC AACTCATGGA 780 

CTCCTCrGTA CTCACTCrCT CCACCACTOC CTTCAOCTCC CTCTTTGATT TTAGAA6CTA 840 

TAGACATTCT TTAAGATAAC TAAGAATACT TGGCTAAGAA GTATAATTTG CTAACTATTA 900 

AGGACTTTCT TTTTTTAATG TTQTACACTA TTCTTCCTAC TCTTTTTTGG TTTTGGTTTT 960 

GTTTTGTAGA GACTGTCTCA CTATGTTGCC CAAGCTGGTC TCAAACTCCT GGCXITCAAGC 1020 

AOTCCTCCCA CCTTAGCTTC TCAAAGTGTT GAGATCACAG GCGTQA6CCA CTGCACCCQG 1080 

COCCTACTOC TTTTTCTAAT AA6CTGTATC TGTAATCACA GCATTOCTAC AGTT6TTACA 1140 

GTOTOTTTTT TAAATGAAAG TAAACATGGT TACATTPGAA TCTCTTAAAT AAGCAGTCAC 1200 

TTGGCIGGAC AGGAAQAAGG TAGATCCPGT GTGTCTTGTT TTCTGGTCAT GTGTATTGTA 1260 

CAAGCTAGAG AGCTGAATTT CTGAGATACA CATTTTCAAA TCACATGCAA GTGAAGATGA 1320 

TGGTCTGTAG AAATTTTCAG TATATATAAT GTTTAATGAC ATACTAATTT ATCATCTGGC 1380 

TATTTGGGAA GGAAGGACAC ACATGGATTT TGCACATTTC CACCATGGTG GCTGGTGTGG 1440 

CTTGTGGCTA T6GGGTGATC ACCAGTATCA CCACTTTGGA AGGGGACAGT GAAATTGGGG 1500 

CTAGAGAAGG AACTTTGTAC AGTTTTCCCT GAGATTCAGA TT GACTG AAA AGTC ACATGA 1560 

AGAGTTGATT GTCTTTTAAT GGTATGTTTT AAACAGCTGA CATTTTAAAT T TTGATG AAA 1620 

TCCAGTTTAT TCGTTTGTTC TTTTATGCTT TGGGTGTTGC ATCQGAGAAA TCTTTTCCCA 1680 

TCCCAAGATC ACAATTTTTT TTCCTTTTTA CTTCTAGAAG T6TTATAATT TT AAGCT TTA 1740 

TACTTTGGTC TATGACCC6T ITrmiTTT GTTTTGTTTT GTTTTTTCGT TTGTTTCTTT 1800 

GTTTTGA6AT GQAGTCTTGT TCTGTCACCC AGGCTG6GGT GCAGTGGCGT GATCTTGQCT 1860 

CACTGCAATC TCTATCCCCT GGGTTCAAGT QATT Crcr TO TCTCAGCCTC CCAA6TAGCT 1920 

GGGATTACAG GCACAGGCCG CCACGCCTGG CTAATTTTTO TATTTTTAQT AGAGACAGAG 1980 




TTTTACO^TG TT6GCCA6GC T6STTTCAAA CTCCTGACCT CAAGTGACCC 
CCCAAAGTTT TGG6ATTACA A6T6TGG6CC ACOGCGGCCA GCCTATGATC 
GAATTTTTTA TATGGTGCAA C3GTGTCAATC CACCTTCACT TTTTCTTGGG
TCC3«3CTGTT TCACTACCAT TTTTTGAAAG GACTGCCCTT TGCTCTATCA 
TTTGTTAAAA AGTAGnGTC AATGTATATQ TGGGTTTATT TCAGGACTCT 
ATTGACCT6T TTTTCTCTCC 1GAATGCC3UV TACCaVTATTI GTATGTAGT6 
TCTAATAATT CTTGAAACAG ATAGTATTAA TGTGTCATAT TTTTGCTGTT 
TTTGTAGAGA TGQGGTTTCA CCGTGTTGGC CAGGCTGTGT TGAACTCCTQ 
ATACACTTGC CTCGTCCTCC CCATGTGCT6 G6ATTACAGG OGTGAGCCTT 
CftGTGTACCA CATTTCTTTT TGAGATTTGT TTTGGCTATO TTAAGTCCTT 
GTGAAATTTG GGAACAGGCA GGGTGTGGTG QCTTATGCCT GTAATCCTAG 
GGCCTAGATG GGTGGATCAC TTGAGCTCA6 GAGTTCCAGA CCAGCCC6GG 
AACTCCGTCT CTACAAAAAA TA6AAAAAAT TAGCCAGGTG TGGTGGTGCA 
CACAGTTACA CXIGCSVaGCTG AGGTGGGAGG ATO^CTTGAA CCCCAGAGGT 
GTGAfiCTCSAO ATCACAOCAC TGTACTGCA6 CCTQGQTGAC AAAGTGASAC 
AAAGAAATTA 6GATCAATTT 6TCAATTTCT ACAAGAACAA CAACAAAAAC 
CACCTTGATT QAGATTGCAT TGAATTTATA TAAAACTGTT GGGAGAATTG 
AATATTGAGT CTTCTGGCCT ATAAACAAGG TCTGTCTTCC TAGGTATTAA 
TCTATTTCTC TTAATAATCT TTTGTAGTTT TCAGTGTACA GGTCTACCAT 
CATAGTTTTO ATOCTAAATO GTATTTTAAA ATTTCAAATT CTAACCACTT 
AATAGAAATA CAATT6ATGT T6AACTTGTA TCCT TCftGCC TT6CTAAACT 
ATGGT6TTTT TGTAAATTAC ATCAACA8TC ATGTGTTCTA TGAATAAAiQA 
TTC 

Seq ID NO: 154 Protein sequence:
Protein Accession #: BAA11503.1

11 



ACCTTG6CCT 
CATTTTGAAT 
AATATAGATA 
CCTTTGCATT 
GTTTTGTTCC 
TATGTAATTT 
GTTTGTATTT 
AGCTAAAGCA 
GG TGCT GGCC 
TGCTTTTGAT 
AACTTTGGGA 
CCTATGGCAA 
TGCCTGTAGT 
CAAGACTGCA 
TCT ATCTC AA 
CCCTGTTGGG 
ACATCTTAAT 
TGTTTTGTCT 
GTCAGCATTT 
GTTGCTAGTA 
QT8AGTTCTC 
GTTTTACTOC 



1 11 21 31 41 51 

I I I I I I 

MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP
TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE MEWFNNYKRS
LATYMRSLGG DEGLDITQDM KPPKSLYIEV RCLKDYGEFE VDDGTSVLLK KNSQHFLPRW
KCEQLIRQGV LEHILS

Seq ID NO: 155 DNA sequence

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 149-709 



1 

i 

GTTCGGCGCC 
CGAAAGGAGT 
AAflGCCGOGG 
OGAGCTGCAT 
AGTTCTGGAG 
GTCAGGTGGA 
AAATCGAOGC 
ATCGGAATAT 
GGAGTGGTTT 
TGAAGGTTTG 
ATGC3«3TGGC 
CSU^CCTCCAC 
GCACTTCAGT 
A6CT6ATCA6 
CAGQCTTCAC 
CCCTCTTTQA 
AAOTATAATT 
ACTCTTTTTT 
TCTCAAACrC 
A6GCGTGAGC 
CAGCATTCCT 
AATCTCTTAA 

AATCACATGC 
ACATACTAAT 
TCCACCATGG 
GAAGGQGACA 
GATTGACTGA 
GACATTTTAA 
GCATCCGAGA 
AGTGTTATAA 
TTGTTTTTTC 
GTGCAGTGGC 
TGTCTCAGCC 
TGTATTTTTA 
CTCAAGTGAC 
CAGCCTATGA 

TTTGCTCTAT 
TTTCAGGACT 
TTGTATQTAG 
ATTTTT6CTG 
GTTGAACTCC 
GGCGTGAGCC 
TGTTAA6TCC 
CTOTAATOCT 
QACX3U3CCOS 



11 
1 

AAAGCGCX36A 
GAGGGGCCGA 
(SAOTGOGAAO 
C60SG6CG06 
GAGATGAAAG 
CGAAGTGATT 
TGCACTGTAG 
GGTAJQOSTCT 
AATAATTATA 
GACATTACAC 
GCGATCTCGG 
CTCCCAGGTC 
CCTATTAAAA 
ACAAGGAGTC 
TCAACTCATG 
TTTTAOAAGC 
TGCTAACTAT 
G Gr r T TG O l T 

craaccTCAA 

CACTGCACCC 
ACAGTTGTTA 
ATAAGCA6TC 
AT6TGTATTG 
AAGTGAAGAT 
TTATCATCTG 
T6GCTGGTGT 
GTGAAATTGG 
AAAGTCACAT 
ATTTTGATGA 
AATCTTTTCC 
TTTTAAGCTT 
GTTTGTTTCT 
GTGATCTTGG 
TCCCAAGTAG 
GTAGAGACA6 
CCACCTTGGC 
TCCATTTTGA 
GGAATATAGA 
CACCTTTGCA 
CTGTTTTGTT 
TQTAT6TAAT 
TTGTTT6TAT 
TGAGCTAAAG 
TTGGTGCTGG 
TTTGCTTTTG 
AGAACTTTG6 
6GCCTATGGC 



21 

I 

GCGGAGGCC6 
GAGCXXAGAT 
CGTCOGKICAT 
AAGGGCAACT 
CTTTGTATGA 
TGATACCAAC 
CATACCTGTA 
TGGCAAATQC 
AAA6ATCTCT 
AG6ATATGAA 
CTCAACCTGC 
CG6TGTCTAA 
AAAAATA8CC 
CTG6AGCACA 
GACTCCTCTG 
TATAGACATT 
TAAGGACTTT 
TTGTTTTGTA 
GCAGTCCTCC 
GGCCCCTACT 
CAGTGTGTTT 
ACTTGGCTGG 
TACAAGCTAG 
QATGGTCTGT 
6CTATTTGGG 
GGCTTGTGGC 
GGCTAGAGAA 
GAAGAGTTGA 
AATCCAGTTT 
CATCCCAAGA 
TATACTTTGG 
TTGTTTTGAG 
CTCACTGCRA 
CTGGGATTAC 
AGTTTTACCA 
CTCCCAAAGT 
ATGAATTTTT 
TATCCAGCTG 
TTTTTGTTAA 
CCATTGACCT 
TTTCTAA1AA 
TTTTTGTAGA 
CAATACACTT 
CCCAGTGTAC 
ATGTGAAATT 
GAGGCC TAflA 
AAAACTCOGT 



31 

I 

AGGCGAGAGC 
ACCATTTTGG 
GTTCTGCGAA 
GCCTGCCTTC 
ACAAAACCAG 
TATCAAATTT 
TGACCGCTTG 
ATTAOGATTT 
TGCTACTTAT 
ACCACCAAAA 
AACCTCCACC 
AAGACTATGG 
AQCACTTTTT 
TCCTGTCATG 
TACTCACTCT 
GTTTAAGATA 
CTTTTTTTAA 
QA6ACTGTCT 
CACCTTAGCT 
CXTTTTTTCTA 
TTTAAATGAA 
ACAGGAAGAA 
AGAG CTGAA T 
A6AAATTTTC 
AAGGAAGGAC 
TATGGGGTGA 
GGAACTTTGT 
TTGTCTTTTA 

TCACAATTTT 
TCTATGACOC 
ATGGAGTCTT 
TCTCTATCCC 
AGGCACAGGC 
TGTTG6CCAG 
TTTGGGATTA 
TATATGGT6C 
TTTCACTACC 
AAAGTAGTTG 
GTTTTTCTCT 
TTCtTQAAAC 
GAT6G66TTT 
GCCTCGTCCT 
CACATTTCTT 
T6G6AACAGG 
TGOGTGQATC 
CTCTACAAAA 



41 

I 

CTGGCGCTGT 
CGT6AGAGCT 
AAA6CXATG6 
AAOGAGGATG 
TCTGATGTQA 
CGACACTGTT 
CTTCGC3VTCA 
CACATGGCTG 
ATGA6GTCAC 
AGCCTATATA 
TCCX3VGGTTC 
AGAATTTGAA 
ACCTCQATGG 
ACCATGOGOC 
CTCCACCACT 
ACTAAGAATA 
TGTTGTACAC 
CACTATQTTG 
TCTCAAAOTG 
ATAAGCTGTA 
AGTAAACATG 
GGTAGATCCT 
TTCTGAGATA 
AGTATATATA 
ACACATGGAT 
TCACCA6TAT 
ACAGTTTTCC 
ATQQTATGTT 
TCTTTTATGC 
TTTTCCTTTT 

GT ' rrrrm ' T 

GTTCT6TCAC 
CrOGGTTCAA 
OGCCACGCCT 
GCTGGTTTCA 
CAA6XGTG6G 
AAGGTGTCAA 
ATTTTTTGAA 
TCAAT6TATA 
CCTGAATGCC 
AGATAOTATT 
CACOGTGTTG 
CCCCATOTGC 
TTTGAGATTT 
CAGG6TGTGG 
ACTTGAOCTC 
AATAGAAAAA 



51

1 

AGGACTAGAA 
GGTGGTTGGC 
AACTGATCC36 
GACTCAGACA 
ATGAAGGAAA 
CTCTGTTAAG 
GAGCACTCAG 
CT6AA6AAAT 
TGGGAGGAGA 
TTGAAGCTGG 
ACCTCAACTG 
GTT6ATGATG 
AAATGTGA6C 
GAGGCACTTC 
CCCTTCACCT 
CTTGQCTAAG 
TATTCTTCCT 
CCCAAGCTGG 
TTGAGATCAC 
TCTGTAATCA 
GTTACATTTG 
GTGTGTCTTG 
CACATTTTCA 
ATGTTTAATQ 
TTTGCACATT 
CACCACTTTG 
CTQAGATTCA 
TTAAACAGCT 
TTTGGQTGTT 
TACTTCTA6A 
TTCTTTTOTT 
CCAGGCTGGG 
GTGATTCTCr 
GGCTAATTTT 
AACTCCTGAC 
CCAOCGOGGC 
TCCACCTTCA 
AGGACTGCCC 
TGTGGGTTTA 
AATACXrATAT 
AATOTaiCAT 
6CXA66CTGT 
TGGGATTACA 
GTTTTGGCTA 
TGGCTTATGC 
AG6A6TTCCA 
ATTAGCCAGG 



TGTGGTGGrrG CATGCCrGTA QTCACAGTTA CftGGGCAGGC TGAGGTIGGGA G6ATCACTT6 2880

AACCCCAOAG QTCAAOACTG CAGTGAGCTG AGATCACACC ACTGTACTCC AGCCTtSGGTG 2940 

ACAAAGTGAG ACTCTATCTC AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC 3000 

AACAACAAAA ACCCCTGTTG GGCACCTTCA TTOAGATTGC ATTGAATTTA TATAAAACTQ 3060 

TTOGGAOAAT TOACATCTTA ATAATATTCA QTCTTCXOGC CTATAAACAA GGTCTGTCTT 3120

CCTAGGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CTTTTOTAGT TTTCAGTGTA 3180 

CAGGTCTACC ATGTCAGCAT TTCATAGTTT TOATGCTAAA TGGTATTTTA AAATTTCAAA 3240 

TTCTAACCAC TTGTTGCTAG TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAO 3300 

CCTTGCTAAA CTGTGAGTTC TCATGGTGTT TTTOTAAATT ACATCAACAG TCATGTGTTC 3360 
TATGAATAAA GA5TTTTACT CCTTC

Seq ID NO: 156 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I
MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP 60
TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE MEWFNNYKRS 120
LATYMRSLGG DEGLDITQDM KPPKSLYIEA GCSGAISAQP ATSTSQVHLN QJLHLPQPVS 180
KRIMRI



Seq ID NO: 157 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 148-621

1 11 21 31 41 51 

I I I I I I 

TTG66C96CCA AA60GC0GAG CQGAGGCOGA GGGGAGAGCC TGGOGCTGTA G6ACTAGAAC' 60 

GAAAGGAGTG AQOOGCC!QAQ AGCCCAGATA CCATITTGGC GTGAGA6CTG GTGOTTOQCA 120 

AGGCXX3CGG0 AQTGGQAAGC GTCCGCCATG TTCTGCGAAA AAGCCATGGA ACTQATCCGC 180

GASCTGCATC GCGOGCCOGA AG3GCAACTG CCTGOCTTCA AOGAGGATGQ ACTCAGACAA 240 

GTTCTGGAGG AQATGAAAGC TTTGTATGAA CAAAACCAGT CTGATGTGAA TGAAGOUUiG 300

TGAGGTGGAC GAAGTGATTT GATACCAACT ATCAAATTTC OACACTSTTC TCTOTTAAGA 360 

AATCGACGOT GCACTGTAGC ATACXTTGTAT GACOGCTTGC TTOQGATCRO A0CACTCAGA 420 

TGGGAATAT6 GTAGCX3TCTT GCCAAATGCA TTAOGATTTC ACATGGCTQC TGAAGAAGTC 480

CGGTGTCTAA AAGACTATGG AGAATTTQAA GTTQATOATG GCACTTCAGT CCTATTAAAA 540 

AAAAATAGCC AGCACTTTTT ACCTOSATGG AAATOTGAGC AGCTGATCAG ACAAGGAGTC 600 

CTGGAGCACA TCCTGTCATG ACCAIGGSCC GAGSCACTTC CAOGCTTCAC TCAA CTCaJlG 660

GACTCCTCTG TACTCACTCT CTCXACOICT CCCTTCACCT CCCTCTTTGA TTTTAGAAGC 720

TATAGACATT GTTTAAGATA ACTAAGAATA CTTGGCTAAG AAGTATAATT TGCTAACTAT 780

TAAGGACTTT CTTTTTTTAA TGTTQTACAC TATTCTTCCT ACTCTTTTTT GGTTTTGGTT 840 

TTGTTTTGTA GAGACTQTCT CACTATQTTG OCCAAGCTGG TCTCAAACTC CTGGOCTCAA 900 

GCAGTCCTCC CACCTTAGCT TCTCAAAGT6 TTGAGATCAC AGGOGTGAGC CACTGCACCC 960 

QQCCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA CAGCATTCCT ACAGTTGTTA 1020 

CAGTGTGTTT TTTAAATGAA AGTAAACATG GTTACATTTG AATCTCTTAA ATAAGCAOTC 1080

ACTTGQCTGG ACAGGAAGAA GGTAGATCCT GTGTGTCTTG TTTTCTGGTC ATGTGTATTG 1140 

TACAA6CTAG AGAGCTGAAT TTCTGAGATA CACATTTTCA AATCACATGC AAGTGAAGAT 1200 

QATGGTCTOr AGAAATTTTC AGTATATATA ATOTTTAATO ACATACXAAT TTATCATCTG 1260 

GCTATTTGGG AAGGAAGGAC ACACATQGAT TTTGCACATT TCCAOCATGO TGGCTGGTGT 1320 

GGCTTGTCGC TATGGGGTGA TCACCAGTAT CACCACTTTG GAAGGGGACA GTGAAATTGG 1380

GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA GATTGACTGA AAAGTCACAT 1440 

GAA6AQTT6A TTOTCTTTTA ATGGTAT6TT TTAAACAGCT 6ACATTTTAA ATTTTGATGA 1500 

AATCXaOTIT ATTOSTTTOT TCTTTTATGC TTTGGGTGTT GCATOCQAGA AATCT TTTCC 1560 

CATCXrCAAGA TCACAATTTT TTTTCCTTTT TACTTCTA6A AGT GTTATAA TTTTAAGC TT 1620 

TATACTTTGG TCTATGACCC GTTTTTTTTT TTQTTTTGTT TTOTTTTTTC GTTTGTTTCT 1680

TTCTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGG GTGCAGTGGC GTGATCTTGG 1740 

CTCACTGCAA TCTCTATCOC CTGGGTTCAA GTGATTCTCT TGTCTCAGCC TCXXaAGTAG 1800 

CTGG6ATTAC AG6CACAG6C 08CCACX3CCT GGCTAATTTT TGTATTTTTA GTAGAGACAG 1860 

AGTTTTACCA TGTTG6CCAG GCTGGTTTCA AACTCCTGAC CTCAAQTQAC CCACCTTGGC 1920

CTCCCAAAGT TTTGGGATTA CAAGTGTGGG CCAC0GC6GC CAGCCTAT6A TCCATTTTGA 1980

ATGAATTTTT TATATGGTGC AAGGTGTCAA TCX3VCCTTCA CTTTTTCTTO GGAATATAGA 2040 

TATCCA6CT6 TTTCACTACC ATTTTTTGAA AGGACTGCCX: TTTGCTCTAT CACCTTTGCA 2100 

TTTTTGTTAA AAAflTAGTTO TCAATGTATA TaTGaOTTTA TTTCAQGACT CTGTTTTGTT 2160 

OCATTQACCT GTTTTTCTCT CCTOAATGCC AATACCATAT TTGTAT GTAG TGTATGTAAT 2220 

TTTCTAATAA TTCTTGAAAC AQATAQTATT AATGTGTCAT ATTTTTGCTG TTGTTTGTAT 2280

TTTTTGTAGA QATGGGGTTT CACOGTGTTG GCCAGGCTGT GTTGAACTCC TGAGCTAAAO 2340 

CAATACACTT GCCTOGTOCT CCCCATGT6C TGGGATTACA GGCGTGAGCC TTGQTGCT GG 2400 

CCCAGTGTAC CAGATTTCTT TTTQAGATTT 6TTTTGGCTA TGTTAAGTCC TTTGCTTTTG 2460 

ATGTGAAATT TQG6AACA6G CAGGGT6TG6 TGGCTTATGC CTGTAATCCT AGAACTTTGG 2520 

GAGGCCTAGA TGGGTGGATC ACTTGAGCTC AGGAGTTCCA GACCAGCCX3Q GGCCTATGGC 2580

AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG TQTGGTGGTO CATGCCTGTA 2640

GTCACAGTTA CACGGCAGGC TGAGGTGGGA GGATCACTTG AACCCCAOAG QTCAAGACTG 2700 

CAGTGAGCTG AGATCACACC ACTGTACTCC AQCCTGGGTG ACAAAGTGAG ACTCTATCTC 2760 

AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC AACAACAAAA ACCCCTGTTG 2820

6GCACCTTGA TTGAQATTGC ATTGAATTTA TATAAAACTQ TTGGGAGAAT TGACATCTTA 2880

ATAATATTGA GTCTTCTGGC CTATAAACAA GGTCTGTCTT CCTAGGTATT AATGTTTTGT 2940 

CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCAGTGTA CAGGTCTACC ATGTCAGCAT 3000 

TTCATAGTTT TOATGCTAAA TGGTATTTTA AAATTTCAAA TTCTAACCAC TTGTTGCTAG 3060 

TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAO CCTTGCTAAA CT GTGAG TTC 3120 

TCATGGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC TATGAATAAA GAGTTTTACT 3180
OCTTC 






Seq ID NO: 158 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I
MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP 60
TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE VRCLKDYGEF 120
EVDDGTSVLL KKNSQHFLPR WKCEQLIRQG VLEHILS

Seq ID NO: 159 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 149-229

1 11 21 31 41 51

I I I I I I

GTTCGGCGCC AAA6C6CGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60 
OQAAAGGAGT QA6GCX3CCX3A GASCXXAGAT ACCATTTTGG GSTGAGAGCT GaTQGTTGGC 120 
AAG6C06CGG GAGTGGGAAS CX3TG06GCAT 6TTCTGC6AA AAAGCCATGG AACTGATCC6 180
CGAGCTQCAT 06CGCGCCCG AAGGGCAACT GCCTGCCTTC AACAATTAGC TOGGTGTQQT 240 
GGCACACACC TGTAGTCCCA GCAACTTAGG AGGCTGAA6T GAGAOQATTG CATGGCTCCA 300 
GGAAGTTGAA ACTGCAGTGA ACTGTCGTCA C6CTATTACA CTCCAGCCTG GGTGACAGAC 360 
TGft ATOO CTG TCTCAAAAAG GAAAA6GAG6 ATGOACTCAO ACAA6TTCT6 GA66AGAT6A 420 
AAOCTTTOTA TGAACAAAAC CAGTCTGATG TGTTCTCrQT TAAGAAATCG AC6CTGCACT 480 
GTAGCATACC TGTATGACCG CTTGCTTOGG ATCAGAGCAC TCAGATOG 

Seq ID NO: 160 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

ATQTTCT6CG AAAAAGCCAT G6AACTGATC CGC6AGCTGC ATOGCGCGCC CGAAGGGCAA 60 
CTGCCTGCCT TCAACAATTA G 

Seq ID NO: 161 DNA sequence
Nucleic Acid Accession #: U10694
Coding sequence: 1333-2280

1 11 21 31 41 51 

I I I I I I 

GOATCSCQGCC GQATCTCAOG aAOdTOAGSA CTTTGTTCTC AGAGGGTGTG TGTOaACAAA 
ACAGG6A6GC CCTGT6TTG6 ACAGAGACAis TG6TCGCAG6 ATTGGAGAGC AGTCGAGGTa 
AGGAACCTAA GGQAGGATOQ AGGGTACCTC CAGGCCAGAG AAACTCTCA6 ATCAAQAGAG 
TTTGCCCTGC CCCTACTGTC ACCCCAGAGA GCCCGGGCAG G6CTGTCTGC TQAGGTCCCT 
CCTTTATCCT GG6ATCACT6 GT6T0GGGGA GGGCTGGCCT TGGTCTGAGO GGGCTGCACT 
CAOSTCAGCA GAGGGAGGGT CCXAGGCrCT GGCAGGAGTC CAGGTGCAGA CTGAGGGGAC 
CCCACTCACC AAACACAGAO GACCTAGCCC CACCCTGCCC CTTGTOTCAS CTGAGGOAAO 
CCX5CTGGGTG GATGGACTCC CCTCACTTCC TCTTCAGGTQ TCT0CTGGA6 ATAGGGCCTC 
AGGTCAACAG AGGGAGGGTT CCAGACCCTG CAOGCATCAA GATGAGGACC AGGCAGTATC 
CTCACCCCAG GACACATG6A CCCCATTGAA TTTAOACATC TCTTACTGTA CTTCXXSAGGA 
AACCCTGGGC AGGTGT6GQC AQATGTTGGT TGGGGCATGT CCTTCTOTTC CATATCAGGG 
ATGTGAGCTC CIGATCTGAG AGACTCTC31G 6CAAGTAGAG 6AGTAGAGTC CA6TGCCTGC 
CAGGAGAAAG GTCAGGGCCC TGAGTGAGCG CAGAGGGGAC CATCCACCCC AAAAGTGTCT 
AGAACTCAAG AGTGTCCAGC CXX3CCCTCTT 6ACAGCACTG AGGGACCGGG GCTCTGCCTG 
CAGTCTGCAG CCTAAGGGCC CCTCGATTCC TCTTCCAGGA GCTCCAQGAA GCAGGCAGGC 
CTTGGTCTGA OACAGTGTCC TCAGGTCGCA GAGCAGAG6A GACCCAGGCA GTGTCAGCAO 
TGAAQGTGAA GTGTTCACCC TGAATGTGC31 CC»AGGGCCC CACCTGCCXTC AGCACACATG 
GOACCCCATA GCACCTGGGC CCATTCCCCC TACTGTCACT CATAGAGCCT TGATCTCTGC 
AGQCTASCIG CAOSCTGAGr AGCCCTCTCA CTTCCTCXCT CAGGTTCTC3Q GGACAGQCTA 
ACC3U3GAGGA C3VGGAGCCCC AAGAG6CCX!C AOAGGAGCAC TGAOGAAGAC GTGTAAGTCA 
GCCTTTGTTA GAACCTCCAA GGTTCG6TTC TCAGCTQAAG TCTCTCACAC ACTCCCTCTC 
TCCX:CAGGCC TGTGGGTCTC CATCGCCCAG CTCCTGCCCA OGCTCCTGAC TGCTGCCCTG 
ACCAGAGTCA TCAT6TCTCT CGA6CAGAG6 A6TCCGCACT GCAAGCCTGA T6AAGACCTT 
GAAGOCCAAG GAGAGGACTT GQGCCTGATG GGT6CACAG6 AACCCACAGG OGAOGAQQAG 
GAGACTACCT CCTCCTCTGA CA6CAAGGAG GA6GAG6TGT CT6CTGCTGG GTCATCAAGT 
CCTCCCCAGA 6T0CTCAGGG AGGCGCTTCC TCCTCCATTT COSTCTACTA CACTTTATGQ 
AGCCAATTCG ATQAGGGCTC CAGCA6TCAA GAAGAGGAAG AGCCAAGCTC CTCGGTCOAC 
CCAGCTCAGC TGGAGTTCAT GTTCCAAGAA 6CACIGAAAT TGAAGGTGGC T6AGTTG6TT 
CATTTCX:T6C TCCACAAATA TOGAGTCAAG GAGC0G6TCA CAAAG6CAGA AATGCTGGAG 
A8GQTCATC3V AAAATTACAA GCGCTACTTT CCTGTGATCT TCGOCAAAGC CTCCGAGTTC 
ATGCAGGTQA TCTTTGGCAC TGATGTGAAG GAGGTGQACC CC3GCCGGCCA CTCCTACATC 
CTTGTCACTG CTCTTGGCCT CTOGTGCGAT AGCATGCTGG GTGATGGTCA TAGCATGCCC 
AAGGCaSCCC TCCTGATCAT TGTCCTGGGT GTGATCCTAA CCAAAGACAA CTGCGCCCCT 
GAAGAGGTTA TCTG6GAAGC GTT6AGT0I6 ATGG6GGTGT ATQTTGGGAA GQAGCACArTG 
TTCIAOQGGG AGCCCAGGAA GCT6CTCACC CAA6ATTGGG TGCAG6AAAA CTACCT6GAG 
TACOGGCAGG TGCCCGGCAG TGATCCTGCG CACTACOAGT TCCTGTGGGG TTCCAAGGCC 
CACGCTGAAA CCAGCTATGA GAAGQTCATA AATTATTTGG TCATGCTCAA TGCAAGA6AG 
CCCATCTGCT ACCCATCCCT TTATGAAOAG GTTTTGGGAG AGGAGCAAGA GGGAGTCR3A 
GCACCAGCCG CAGCCGGGQC CAAAGTTTGT G6GGTCAGGG CCCCATCCAG CAGCTOCCCT 
GCCCCATGTG ACATGAGGCC GATTCTTOSC TCTGTGTTTQ AAGAOAGCSVA TCAGTGTTCT 
CAOTGGCAGT GGGTGGAAGT GAGCACACTG TATGTCATCT CTGQGTTCCT TGTCTATTGG 
GTQATTTGGA GATTTATCCT TGCXCCCTTT TGGAATTGTT CSUUITGTTCT TTTAATGGTC 
AGTTTAATGA ACTTCACCAT OQAAGTTAAT GAATQACAGT AGTCACACAT ATTGCTGTTT 
ATGTTATTTA GGAGTAAGAT TCTTGCTTTT GAGTCACATG GGQAAATCCC TGTTATTTTG 
TGAATTGGGA CAAGATAACA TAGC3W1AGGA ATTAATAATT TTTTTGAAAC TTGAACTTAG 
CAGCAAAATA 6AQCTCATAA AGAAATAGTG AAATGAAAAT GTAGTTAATT CTT6CCTTAT 
ACCTCTTTCT CTCTCCTGTA AAATTAAAAC ATATACATGT AlACCTGGAT TTOCTTGGCT 
TCTTTG AGCA TGrAAGAGAA ATAAAAATTQ AAAGAATAAT rrnXJCTtflT CACTGGCTCA 
TTTTTTCTTC AGACA03CAC TGAACATCT6 TTATTCGOAA CACCCTGGGT T 



Seq ID NO: 162 Protein sequence:
Protein Accession #: AAA68877.1

1 11 21 31 41 51

I I I I I I

MSLEQRSFHC KFDEDLEAQG EDLGLMGAQE PTGEEEETTS SSDSKEEEVS AM3SS8PPQS 60 

PQ06ASSSIS VYYTLHSQFD E6SSSQEEEB PSSSVDPAQL EFMFQEAIiKIi KVAELVHFLIj 120 

HKYRVKEPVT KAEMIiESVIK NYKRYPPVIF GKASKPMQVI FGTDVKEVDP AGHSYILVTA 180 

LGLSCDSMLG DOHSMPKAAL LIIVL6VILT KDNCAPEBVI HEALSVHGVY VGKEHMFYGE 240 

PRKLLTQOWV QENYItEYRQV POSOPAHYEP LHQSRAHAET SYEKVINYLV mUKRBPldC 300 
PSLYEEVLQE EQE6V 

Seq ID NO: 163 DNA sequence
Nucleic Acid Accession #: AF292100
Coding sequence: 30-809

1 11 21 31 41 51

I I I I I I

GGG66GGGAG A6GCCTGGAG GACACCAACA TGAACAAGTT GAAATCATCG CRGAAG GATA 60 

AAGTT06TCA GTTTATGATC TTCACACAAT CTAGTGAAAA AACAGCAGTA AGTTGTCTTT 120 

CTCAAAATGA CTGGAAGTTA GATGTTGCAA CAOATAATTT TTTCCAAAAT CCTGAACTTT 180 

ATATAOSAGA GAGTOTAAAA GGATCATTGG ACAGGAAGAA GTTAQAACAG CTGTACAATA 240 

GATACAAAGA CCCTCAAOAT GAGAATAAAA TT6GAATA6A TGGCATACAG CAGTTCTGTG 300 

ATGACCTGGC ACTGGATCCA GCCAGCATTA 6TGTGTTGAT TATTGCGTGG AAGTTCAGAG 360 

CAGCAACACA GTGOGAGTTC TCCAAACAGG AGTTCATGGA TGGCATGAC3V GAATTAGGAT 420 

GTGACAGCAT AGAACAACTA AAGGCCCAQA TACCOU^T GGAACAAGAA rTGAAAGAAC 480 

CAGGAC6ATT TAAGGATTTT TAOCAGTTTA CTTTTAATTT TSCAAAGAAT CCAGGACAAA 540 

AAGQATTAOA TCTAGAAATG GCCATTGCCT ACTGGAACTT A6T6CTTAAT GGAAGATTTA 600 

AATTCTTAGA CTTATGGAAT AAATTTTTGT TGOAACATCA TAAAOQATCA ATACCAAAAG 660 

ACACTTGGAA TCTTCTTTTA GACTTCAGTA CGATGATTGC AGATGACATG TCTAATTATG 720 

ATGAAGAAGG AGCATGGCCT GTTCTTATTG ATGACTTT6T GGAATTTGCA CQCCCTCAAA 780 

TTOCTOGQAC AAAAAGTACA ACAOrGTAGC ACTAAABOAA GCTTTTASAA TGTACATAGT 840 

CTGTACaATA AATACAACAG AAAATTGCAC AGTCAATTTC TGCTGGCTG6 ACT6AACTGA 900 

AGATCAATCC TCACAATTCA GACTGAGGGT TGA6ACAAAA CTTTAAGGAT ACATCTTGGA 960 

CCATATOSTA TTTCATTCTT CTTAATGGTGG TTTGGGCTTG TCTTCTAGTC TGGGCOGCTC 1020 

TAAACATTTA TAATTCCAAC ATTCTGGATT TCATCTTATA TCTGTGGACC ATGCTAGTTT 1080

ATTCTOCCAT AAGTCTTASA AGCTTTATGO TGATIATTTT GAGGTTTTCA TTCTCGCATA 1140

AAGCACAATG CTGTCTTCAT CAGAAAACA8 TTGGCATAAG AATTAAACAT ATGAACATCA 1200 

CAAAACAATT TATAAAAACT TCTTAAATAT ACGCTTTG6G CTAGTTGCAA AOACTATGCT 1260 

AATAGCACTT CCAGTGAGAG TGATATATTT AAGTGTACTG GATCTGGAAT GGTGTTTTOG 1320 

TTTGGGGGGA ATTTTTTTTT TTTCCTGGCA AATCACATAT GTTGTTGATG TOAGTATCTG 1380 

ATGAAAAAAC AATGTCAGAA TAACC36ACAT QAAAATTTTT TAGGATAACT TGGTGOCTAC 1440 

CTGAAAAATG TATTGTGTTT TA6ACTCTTG ATTTCAAAAG GTTCCACAiGA ACTAGTCT6C 1500 

GCTTACCTTA CCCATGTTTA TATATAGCTO TCCTACAG6G AGCTTTTATT TA6AAAATGT 1560 

CTGCATAATG TTAGATrCTT CTCCTGTCTA CATTATGCAC TACATAATTG GACTTCATTA 1620 

TGCTTTTGAA ATGCTTATCT GCCTGTCACA TAAGTTAAAC TATTTAATTT GTTTTGAATG 1680 

TTTTGGATTQ CTACACAATA CAATATTCTA AATTTW3GCA TGAGGGrTTT TTTOTTTTAT 1740 

TTTTACTTTT TTTTT G TCAT TGCACTATG6 AACRCAAATO AAATTCTCTT AATTTATAAO 1800 

AAGATAGTAG GAGTTAAATT TTGAAAATGG TTGTGATGAG CCACGAAATT CAATCTTTAT 1860 

AATATAGGTA CTGCTCTTTC AGACAAACAG TCCATTTTTA ATGACTTCTT ATTTTGTTGA 1920 

AATTACTTTA ACTGCTAATC ACTGTGGTTG CCAAATATTT ACTTCAGAAG CAAAGATTTT 1980 

CAAACMVGCA TACAOSATGC AAAATACCAG TCTG6CTTCT AOTCTATTTA CTOTTTTGTT 2040 

TCACTCAtaT TA6CTCAGTT TTCTCATCM AGCAGAATGC TATCTTOOCrP GTGXGTGTGT 2100 

G TGTGTGTGT GTGTGTGTOT GTATGTGTGT ATATATATAT ATATATATAT ATATATATTT 2160 

' XTITlTrrn ' TTTTTTTTAA ATIACAAMQ CCATQAGCTG CTTTTATGCT GAAAATGGTC 2220 

ATTTCCCTGT TCACTTACTG ACATQTGAAQ AAGQGTTTCT TGCTTTCTTA AACATTTCCG 2280 

TAAGGCAGGC TAGAAATQTA ATACTTCAAA TOTTTOATGA TTAIGGTCTT TTGATAGGAA 2340 

TAGATTCTGC TTOGGATATA TATCX3W30CA CTCTCTAAGG TCTA6GGTTG ATATTAACAA 2400 

AGGAATGTAC TTAGAATAGC AGTACATTTT ATGCAAATAT GGAAATTATT TTAAGAAACA 2460 

ATGACATATC AAAACTGCTT TTTACATGAT TTTGAAATAG ACTAGAAAGC TTTCCCTATA 2520 

GACATATTAA TATTCCAATC ATAACTTTAA TTCAAGAATG CAGTTTTACC AAAAGAAAAA 2580 

TTTGAAAATT TCTATTCAGG CTACTGGAAT TGGTTATTAA AAGAAAAAGG AAAAAGAAGA 2640 

ATCTTGCTGC TTTC3U3TATT TCCTGATTTT TTTGTAAATA TAAAGAGQAA CTTCAATTAT 2700 

QAAAAATTTT TAAAAGATAT ATATATCTAT ATATCTATAT ATATGTACTG TTTTOTTTCC 2760 

TGTCTTQAAG ATTTTGAGTT ATGGTTATTG GTTTCAQATT GATTAATTCA CATATGCTGT 2820 

GTTTTCTTTA AAAGTCATAT GGGTTCGTGG CCTAATQCCT TGGATTTTAC ATATTTTTCT 2880 

TTTTAAATGC AAAACCTTTT CAACAAAATA GTGTTTOTCA TCAGGTTGGT ACTAAACATT 2940 

TATAATTACT GTGTAATTAT AAACAAAAAT ACATAAAGCT TTGAATATAA TTATGTAGCA 3000 

TAAAAGTTAA GGTTGTTCAC TATQATOGCA TCTTASAATT AAACAAAACT TTTACTAGGG 3060 

CTGAAAAGAG AAGACTGATT TAATGT6GT0 TGATTATTCT GAAGATAAAT GTCTGGCTAC 3120 

AGGGAATATT TTGTACTAAA AAATQATTAC ACATATGGCT GTGTGTGTTT 6AGTCT6TGT 3180 

CTGTGAGAGA GCCAGAGAGA GTGAQAGAGA TTGACAGAGA AAGGGAGAGA CACACACAOG 3240 

CCCXTTTGAAT TQCTTTAACT CCTAAGTGTT TCAGTCCTCA TTC0C3GTAAA CTCCCCATGC 3300 

TGATTCTTTG TTTTAAACTG AACCATAGGT ACAGTTTCCT TTTTGCCAAA T6TCAAAACA 3360 

GGTACAAATT TTAAAATGTA ATGCTTTTTA AATA6AAAAA TGTATAAAAT TAGAAGTGCC 3420 

CACATATAAA AAATACTTGA GAXGAAGATT ATCTTTAGTO AATATCATCT OCATATCTCT 3480 

GTAAGTTCAA TT6TGTTTCT TACAGTCCCT GTCATATTAC CAACAOAGGC AATAAAAGCT 3540 
GCAGTGAAAT TG 

Seq ID NO: 164 Protein sequence:
Protein Accession #: AAG00606

1 11 21 31 41 51 

| | | | | |

MNKLKSSQKD KVRQFMIFTQ SSSKTAVSO. SQNDWKLDVA TDNFFQNPEL YIRESVKGSL 60 

DRKKLEQLYN RYKDPQDENK IGIDGIQQFC DDLAU)PASI SVLIIAWKFR AATQCEFSKQ 120 

BFMDGhrrELG CDSIEQLKAQ IPKMEQEliKB PGRPXDFYQF TFNFAICMPGQ XGLDLEMAIA 180 

YMNLVLNGRF KPLDLWNKFL LEBHKRSIPK DTHNLLLDFS TNIADDNSNy DEEQAHFVLI 240 
ODFVEFARPQ lAGTKSTTV 
Seq ID NO: 165 DNA sequence
Nucleic Acid Accession #: AF256215
Coding sequence: 220-2028 

1 11 21 31 41 51 

| | | | | |

CTCCAGTCCG CATGCTCAGT AGCTGCTGCC QGCCGGGCTG CGGGGCGGCG TCCGCTGCGC 60 

GCXn*ACGG6C TGGG6TGGCG GCCGCCX3CG6 CACC066CAG GGCCGGCCAG TCCCC6CTTC 120 

CXTTGCTCICAG AGCC6C0QCC TG6GCOGGC3Q CAGGGCG6GC CCGG6GCTCC TCCAT6CTGC 180 

CAGCC6CCG6 6CTG06GAGC CGACXAAGTG GCTCCTGCGA TGGCGGCGGA AGAGGA6GCT 240 

GCGGCGGGA6 GTAAAQTGTT GAGAGAGOAG AACCAGTGCA TTGCTCCTGT GGTTTCXAGC 300 

CGCGTGAGTC CAGGGACAAG ACCAACAGCT ATGQGGTCTT TCAGCTCACA CATGACAGAG 360 

TTTCCACGAA AAC6CAAAGG AAGTGATTCA GACCCATCCC AAGTGGAAGA TGGTGAACAC 420 

CAAGTTAAAA TQAAGGCCTT CAGAGAAGCT CAXAGCXAAA CTQAAAAGOG OAGOAGAGAT 480 

AAAATGAATA ACCTGATTGA AGAACTGTCT GCAATGATCC CTCA6TGCAA CCCCATGGOG 540 

CGTAAACTGG ACAAACTTAC AGTTTTAAGA ATGGCTGTTC AACACTTGAG ATCTTTAAAA 600 

GGCTTGACAA ATTCTTATGT GGGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATQAG 660 

CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 

GGAAAAATTC TCrrCGTTTC TAAGTCAGTC TCCAAAATAC TTAATTATOA TCAGGCTAGT 780 

TTGACT6GAC AAAGCTTATT TGACTTCTTA CSATCCAAAAG AT6TTGCCAA A6TAAAGGAA 840 

CAACTTTCTT CTTTTGATAT TTCACCAAGA GAAAAGCTAA TAGATGCCAA 7ACTGGTTTG 900 

CAAGTTCACA GTAATCTCCA COCTGQAAGG ACACGTGTGT ATTCTGGCTC AAGACGATCT 960 

TTTTTCT6TC GGATAAA6AG TTQTAAAATC TCTGTCAAA6 AAGAGCATGG AT6CTTACCC 1020 

AACTCAAAOA AOAAAGAGCA CAOAAAATTC TATACTATCC ATT0CACTG6 TTACTTQAGA 1080 

AaCTGQCCTC CAAATATT6T TGGAATGGAA GAAGAAAGQA ACA6TAAGAA AGACAACAGT 1140 

AATTTTACCT GCCTTGTGGC CATTGGAAGA TTACAGCCAT ATATTGTTCC ACAGAACAGT 1200 

GGAOAGATTA ATGTGAAACC AACTGAATTT ATAACCCGGT TTGCAGTGAA TGGAAAATTT 1260 

GTCTATGTA6 ATCAAAGGGC AACA6C6ATT TTAGGATATC TGCCTCAGGA ACTTTTGG6A 1320 

ACTTCTTGTT AraAKTATTT TCATCAAGAT GACCACAATA ATTTOACTOA CAAOCACAAA 1380

QCAOTTCTAC ASAGTAAfiGA GAAAATACTT ACAGATTCCT ACAAATTCAG AQCAAAAGAT 1440 

GGCTCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAQAA 1500 

CTGGAATATA TTGTATCTGT CAACACTTTA GTTTT6GGAC ATAGTGAGCC TGGAGAA6CA 1560 

TCATTTTTAC CTTGTAGCTC TCAATCATCA 6AAGAATCCT CTAGACA6TC CTGTATGAGT 1620 

GTACCTGGAA TGTCTACTGG AACAGTACTT GGT6CTGGTA GTATTG6AAC AGATATTGCA 1680 

AATGAAATTC TGGATTTACA GAGGTTACAO TCTTCTTCAT ACCTTGATGA TTCGAGTCCA 1740 

ACAGGTTTAA TGAAAGATAC TCATACTGTA AACTGCAGGA GTATGTCAAA TAAGGAGTTG 1800 

TTTCCACGAA GTCCTTCTGA AATGGGQGAG CTAGAGGCTA CCAGGCAAAA CCAGAGTACT 1860 

GTTGCTGTCC ACAGCCATGA GCCACTCCTC AGTGATGGTG CACAGTT66A TTTC6ATGCC 1920 

CTATGT6ACA ATGATGACAC A6CCATGGCT GCATTTATGA ATTACTTAGA AGCAGAGGGG 1980 

GGCCTGGQAG ACCCTGGGGA CTTCAGTGAC ATCCAGTGGA CCCTCTAGCC TTTQATTTTT 2040 

AACTCCAAAA ATGAGAAACA TTTTAAAGCA TTATTTAC6A AAAAACTGTC TCAACTATTC 2100 

TTAAGTACTG TATTGATATT GTTTGTATCT TTTATTAATG TTCTACCACT TSTCPCSPS3Kt 2160 

TTGCATCTTC CTOTCACAGG GATGTGGGGA AATAOGTTTT CCTCCCAAGA GAACCAAGTT 2220 

TATTATAGAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAiSTTQCC ATATTTTTGC 2280 

TAAAATATTT CTAACCAAGA ATACTACTTA CATATTGTTT TGGCTTTGTT TTATTTTTQA 2340 

TGCA&TTTTT TTTAGTT6A6 GTAATGTAAT ATATTCATQT TTTCCTTXGT GTCTAAGATT 2400 

GATTTATAAT A6TAGGTTTG TATAATTTOQ AACATTTTCC ATGCCTT806 AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAOTGAGT ACTT6AACAT TTAAAGGGAC 2520 

AGTGCAATTT ATAGTCATAA TCACATTGAA TACTGTATTT GATCTTTGGA GACTTAGGCA 2580 

AGCACAGAGC TGGGATATTT ATGCTCAGTT GAGCACTTTA A6ATGAATTT TAAGTGAGAT 2640 

GATTTCTTGC TTAAAACTCA GAAAGTCAAA AGA6TTTCAG CTTTCCTTAC AQAAAAGGAA 2700 

0GATCTT6G8 CCCTAOATCT TGGGGATTAA OCTCTGCATA TAAGATTTAC TCTTAATAGO 2760 

CCAGACGTGG T6CTCACGCC TGTAATCCCA GTACTTTGG6 AGGCTGAGAC GGGCAGATCA 2820 

CTTGAGGTCA GGAGTTCAA6 ACCAGCCTGG CCAATATGGT GAAACCCC6T TTCTACTAAA 2880 

AATACAAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCACGATAA 2940 

TGACA6TCCA TTCATGAGCG CAAAGGCCTC ATGACCTAAT GGCACACACC TGTAATCCCA 3000 

ACTGCITGGG AGGCTQAGGC GAGAGGATTO CTTQAACCTQ QGAGGGAOAG GTTGCAGTGA 3060 

GCOGAGATOS CACCACTGCA CTCCA8TCT6 GGCAACAOAG TC»6ACTTCA TCTCAAAAAA 3120 

AGTAAAAAAA AAGATTTAAT ATAATCACTQ AAQATCTCTA TTATAGATAG ATTAGGTTTT 3180 

TGACATTGGA AACATACTTA GGGATAGATT TGTCCTAAAG GAAAAAAGTA GGCCCGGGCA 3240 

GATTAAATGT CTTGTGTAAA GTCACACATT AAATTCA6TC ACACATTAAA TTCATAGAGT 3300 

TTTAAATQTT TAATQTATAT AAACCAOTTT CTTTATACAC ATTTGGGAAA ACATTOGTCT 3360 

CACAGATTAA ATQATTAACT AACTGACCCA GGAACTAGTT GTA6CTTTCT AAOTAATTAO 3420 

GCAATTACA6 TTATT6CCT6 TAACCAAAGG TAATAAAACA AAATGACAAG TACATGTTTA 3480 

AAATTATGAO GCAATGA6AA ATAATTTAAA AACCAATTTT CTAQTTATAA TTTAAAATTT 3540 

GGAGAGCATT TTTAACAGTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT AAGAATTAAG 3600 

ATTATTTAAA ATACTGCATG TCTACCTTCT OGGQGATCAT ACTTTATAAC ACTTTCTGCT 3660 

TCAGTAGCTC TTCATAGCTT GCCAAGTATG CTCCCATATT TTCTCTCTOG TGCCTCGCAA 3720 

ATGAAAGTCA GATAGGCT6G GAACTCATGG GGCAQCCCTC AGACTTCAAT GTGGGCTTCA 3780 

AATCCAGTTT CCTGTTCTAT ATGGT6CTAC ATCTTTCCAG AAAATTTCCC TCAGAGCCCC 3840 

TCGCCAAAAC AAAGCATTAT TTTGACCCTG CATGCTATTT CTTTAGCTGT AGGTGATAQA 3900 

TTAGAACTTC TGTCA6ACAT GTTAATGACA AACATACCAA CAGACAATAA CCAAAGCAAA 3960 

TGTTTCCTTC AAGTGTGAAA TGTGCAGGGG CTCGTGGGCA AGGATGTATT GGCACACTGT 4020 

CCTCTTGAAC T6ATAGTGTC CCAGCAATCT TGGAGGTTG6 CACCAITCCT GGTCCGACAC 4080 

TTGAGGAOCT GAOAGACATC AG6TTTAGAA TGAGCCAAA6 AAATCCTACA AGATGG6GA6 4140 

AATTGGTGTG CAGCAGCCTA AGTGTTATAG TTAASTCTAA A6AAGTATGA AAGATCCCCT 4200 

GTGTTCTCTA AATTGAGCAG AGGGGCCTGC CTACCAATAT CACTTTTTAG GGGACTGAAC 4260 

CATTGCAGGT TAGACTTGGC TTCCAAAGAG TCTGCCTAAG CCA6QGGTGG CAGGGTAGGC 4320 

CATCATAGCT GGATGGCCTC AAAAGCAGAT GGGGGCAGAC TTGCCCTCGT GAT6CCAGGA 4380 

TTTGAGA6GC A6AGTTTCTA 6AGGGAGACC A6TGCT6CCT CTCACAGTG6 CAGTTTTTTC 4440 

TCTTTGCAAG AGGAGGG6CT GTTCAATTCC ATAGACCAOT GGGCAGATAG CCAGTTGAAT 4500 

ACTCTGTGCA TGGTTTGATC CTTTATTAQT TOGCTCTAAT ATTTTTCTGT AGATCCTTTT 4560 

GTCCTQGACT CAAAATCTAA TCCATGCATT GTATGATACC GTAGCTCTCC TAAGGTTTGT 4620 

GTTTCCTTCA AAAT6TTTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCT6 TTAGAAGTGA 4680 

CATATTTTTA TGGTATACAC TATGTTCCTT TTTTCTACT6 C6ABTCAATT TTTTGAATTT 4740 

TCGIOAGAAA GAATATATCT ACAAATT6CA CGAAAGTATC ATAAAAACAG TACTCTAGAG 4800 
CAGCGCTCTC CSUVTAOAAAT ATAATCTGAG CCACA TGTA T
ACATTAAAGA A6TAAAAA0A TACAAGTAGA ACTAATTTTA 
CAAAATATCA TTTGAACATO TAATTAATAT AAAATTATTA 
TTGGTAATAC TAGTCTTCAA AATCTGGTAT GTATCTTACA 
GTACTAGCCA CATTGCAAGT GCTCAGTAGC CACATGTG6C 
A6CACAGTTC TAGGTTCCAC CCTAACACOC AAGTCCTGTG 
AGCIGQAAQT AAACATAQAG ATCAAACCTC CTTTTAAAAA 
6TTTAAAT6G CTT6CATGAG GTCATACAGC TAAATTCAGC 
CCaCGCACTC TTCCX3^CrCC ACTACATTAC TGTAGTOGTA 
TGTAOAGTAG GCCGGGCGCA GTG6CTCATG CCTGTAATCC 
GTGGGGGGAT CAOGAGGTCA GGAGATOGAG ACCATCCT66 
CTCTACTGAA AATACAAAGC AAAATTAGCC AGGTGTOGTO 
CIGCTCTGGA GGCTGAGGCA GAATG6C36TG AACCCAGGA8 
AGATC6CGCC ACTGCACCXX: AGCCTGGGOQ ACAGAGOGAG 
AAAAAAAAAA AAGAAAAGAA AAGAAAAOTC TAQAOAACAT 
GAAGTAGACC AAAGTTTATA CCATAAG6AT ATTTTTCCTT 
AATTATTTAT TGATCCTTGA ATCTGTAAGA TCAAATAACA 
AATTTAACCT TTTGAAAATA ATAAACTTTA AAATATCA6A 
CTT6GAATCA AGTGAAATGA GTTATATGGT CATCACTAAA 
CAAA6ACAAA CAGGAAAGTA CAGAATAGAG ACTTTTAGTA 
AAGTGTTTAT TTACAGTGTC ACX3ACAGAAA AGGATGTCTT 
GATCTCCGTA AAATCTGGGG CACAGGTACA AGAAATAGCC 
TGTTTAGTAG TQTCCAGTTT CAGATCATGC TGCCAAGAGG 
CATCACTGAG CCCTGQAATT GGAGACTCAT ACTTGCCCAG 
GGCCGACATC TATOATTAGC TAGAAGCCAT AAAGAAAAGC 
CCACTTTTCT GTTTTTGTAA TGCTTTCATT AQCAQATCTT 
GCCTATGAGA GGCATTTATG ATTTTTGTGC CTACAATAAG 
TGTTTTATGA GAAATGCTTT CCAAGGGAOQ TCTAGGAAGA 
GGCTTAGAQA GCTTTCCAGG T6TAGTGCCA ATAAAAACTG 
CAOCACGGAA CAT6CTTTCT GAACTCACTT GA GAiJTG TAT 
TATTCTTGAG TTTAGATTTQ TCTTTTATAC AATTTTTAGC 
CTCGTCTGTA TATTGGTATT TTTAAATTTT TGTGGTAAAT 
ATTTTATAAT TACTCATTTG TAflTTTTTTT .TTTTAATTTA 
GGTCCCTTAA AA 



Seq ID NO: 166 Protein sequence:
Protein Accession #: AAG34652






AATTTTATTT 
ATGTTTTAAT 
ATGTGATATT 
TTGATAGCAC 
TAGTGGCTAC 
GATTAGAATC 
TGAGGAOGCT 
CTCAACAGGG 
ATTCTTAGGG 
CAGCACTTTG 
CCAACATGGT 
GCGGGGGCCT 
6CAGAGATGG 
ACTCCATCTC 
TATATTAA6T 
AAATACCATQ 
AGTCTCTATC 
TGTGTTATTA 
TTTAGAAATC 
AATAAATGGA 
TGTTGTCATA 
AATATTTAGT 
TATCTCCCCC 
CACAATGTTA 
TGCTAAfiTGG 
TTTTTTCCAA 
TCAGCCTGTC 
TCCTGACACA 
ACCTGGAAAG 
6GTGTATGTC 
TCTTTTCCA6 
AATGAAAAGA 
ATAAACrrCC 



TCTTCTAGCC 
TCAGTATATC 
TTACATTCTT 
ATCTCACTTT 
TGCACTGOAC 
CCASAATCAQ 
6AGGCACAGA 
TCTTCTGATT 
TTAAAAAAAG 
G6A680CX3AA 
GAAACCCCGT 
GTGGTCCCA6 
CAGTGAGCCA 



GGTrATTATT 
TTT6AAGAAC 
CATGTTACCA 
CAGGATGATA 
TATTGTGAAA 
ATTTAAAAGA 
GTCTTTGAGG 
TCCCAGACCA 
TCAGGTGGGT 
06GGCA6ACA 
CCACTAGGTQ 
GCTCCATGGG 
TGGTGTGAGT 
TAAGAACTTT 
AAAACCTGCC 
ACTTCTCATA 
TTCACTTGT6 
GTGAAATTAT 
TOCAAAAAGT 



MAAEBBAAAG 
QVEDGEBQVK 
QHLKSLR6LT 
UTYDQASLTQ 
ySGSRRSFFC 

LPQELLGTSC 
TNPWTKSLBY 
SXGTDZANEZ 
TRQKQSTVAV 
TL 



11 
1 

GKVIiRBEKQC 
MXAFRBABSQ 
NSYVGSKYRP 
QSLFDFLHPK 
RZK8CKISVK 
CLVAIGRLQP 
YEYFHODDRN 
ZVSVNTLVL6 
IiDLQRLQSSS 
HSEEPLLSDG 



21 



lAPWSSRVS 
TSKRRRDKMH 



OVAKVXSQLS 



yXVPQHSGEI 
NLTOKHXAVL 



31 
I 

PGTRPTAHGS 
NLIEELSAMI 
LIIiKTAEGFL 
SFDISPREKIi 
KKEBRKFYTI 
NVKPTBFITR 
Q8KEKILTDS 



41 



YU3!D8SPTGL 
AQUIFDALCD 



MKDTHTVNCR 
NDDTANAAFM 



PQCNFMARKL 
FWGCERGKI 
IDAKTGIiQVH 
HCTGYLRSWP 
PAVNGKFVYV 
YKFRAKDGSF 
SRQSCMSVPG 
SMSNXELPPP 
NYLBAEGGLG 



Seq ID NO: 167 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86-1126 



1 

1 

GGTTACTCAT 
GACSGCCAAOG 
GATCT6GACT 
GTGCTACAGC 
GAAGTGCGCG 
CGGACAATTC 
OGGCCTGGAT 
CTGCAACGCC 
ATACCXX5CCC 
G6GTACATCG 
CTTOGACGGC 
CTGTGTCCAO 
7GGCTCCT6T 
CCCT06AATC 
CACATCTGTC 
GCCAGCGCCA 
GGAGCCCAGG 
TCCIGCAAAA 
ATT66CAG0C 
AAATTTCCCT 
CCCACOVCTG 
CTTCTGCTGC 
GGGTGTTCTA 
TCCTCTTGTG 
AG6AT0CTAA 
GGTGGGACAA 
ATOGGTTCCC 
CTTATGTCrO 



11 

1 

CCTGGGCTCA 
GAGCAGGACG 
GCAGGCTGGC 
TGOGTGCAGA 
CCGGGCGTGG 
TCGCTGGCAQ 
CTTCAOGGGC 
AAGCTCAACC 
AAOGGOGTGG 
CCOCCGOTCG 
AAOGTCACCT 
GATGAATTCT 
TGCCAOOGGT 
CCACCGCTTG 
ACCACTTCTA 
ACCAGTCAGA 
TTGACTGGAG 
GGGGGGCCCC 
CTTCTGTTGO 
CTCACCTACT 
GACTGGGCTG 
GCTGGTTTGC 

GCrrrTTGAG 

ATGTTAGGAC 
6CTTCCTACT 
TGGCTCCCCA 
CATATGTCTT 
TGTGTQATCA 



21 

I 

GGTAAGAGGG 
GAGCCATGGA 
TGCTGCTGCT 
AAGCAQATGA 
AOQTCTGCAC 
TGCSGGGTTG 
TTCTGGCGTT 
TCACCTCGCG 
AGTGCTACAa 
TGAGCTGCTA 
TGAG6GCA6C 
GCACTCGGGA 
CC06CTGTAA 
TCCGGCTGCC 
CCTCGGKCC 
CTCCGAGACA 
GOGCCGCTGG 
AGCAGCCCCA 
C0GT6GCTGC 
TCTCTGGCCC 
GCCCA6CCCC 
GGCTTTGGGA 
6ACA6CTCCT 
AGA6T6AGAG 
CACTTTCTOC 
CTCTAAGCAC 
CCTTACTAGA 
GTTTCTGGCA 



31 
I 

CCCGAGCTCG 
CCCCGCCAGG 
GCTGCTTOGC 
CGGATGCTCC 
CX3AGGCXX3TG 
CGGTTOGGGA 
CATCCAGCTG 
GGC6CTCGAC 
CTGTGTGG6C 
CAAOGCCAGC 
TAATGT6ACT 
TGGAGTAACA 
CTCTGACCTC 
CCCTOCAGAO 
AGTGAGACCC 
GGGAOTAGAA 
CCACCAGGAC 
TAATAAAGGC 
TGGTGTCCTA 
T66GTACCCC 
TGTTTTTCCA 
AATAAAATAC 
GTATCCTTCT 
AAGTCAGCTO 
TAGCCAGOCT 
TGCCTCCCCT 
CTGTGAGCTC 
CATAAATGCC 



41 

I 

GAGGCGGCAC 
AAAGCAGGT6 
GGAGGAGC6C 
CCm ACAAQA 
GGG6CG6TGG 
CTCCCCX3GCA 
CAGCAATGCG 
COGGCAGGTA 
CTGAGCCQGG 
GATCATGTX^T 
GTGTCCTTGC 
GGCCCAGGGT 
CGGAACAAGA 
OOCACGACTG 
ACATCCACCA 
CACGAGGCCT 
OGCAGCAATT 
TGTGTGGCTC 
CTGTGASCTT 
TCTTCTCATC 
ACATTCCCCA 
CGTTGTATAT 

catccttgtc 
tcag^ggaa 
gqactttgga 

ACTCCCCQCA 
CTOQAGGGCA 
TCAATAAAGA 



4860 
4920 
4980
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 



51 
I 

KRKGSDSDFS 
DKLTVLRMAV 
LFVSKSVSKI 
8NLHAGRTRV 
PNIVGMEEER 
DQRATAILG7 
VTLKSQWFSF 
MSTGTVLGAG 



DPC3)FSDIQH 



51 

1 

ACCCAGGGGG 
CCCAGGCCAT 
AGGCCCTGGA 
TGAA6ACAGT 
AGACCATCCA 
AGAATGACOG 
CTCAGGATCG 
ATGAGAGTGC 
AGGG6TGCCA 
ACAAGGGCTG 
CTGTCCX3GGQ 
TCAOGCTCAG 
CCTACTTCTC 
TGGCCTCAAC 
CCAAACCCAT 
CCCGGGATGA 
CAGGGCAGTA 
CCACAGCTGG 
CTCCACCTG6 
ACTTCCTOTT 
GTATCCCCAG 
ATTCTGGCAG 
TCTCCX3CTT0 
GGTQAGAGAO 
GCX3TOG6GTG 
TCTTTGQGQA 
GG6ACCX3T6C 
TTTAATTACT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
TTGTATAGT6 AAAAAAAA

Seq ID NO: 168 Protein sequence:
Protein Accession #: NP_055215

1 11 21 31 41 51

| | | | | |

MDPARKAGAQ AMlWTAtSWLL UiLLRGGAQA LECVSCVQKA DDGCSPHKMK TVKCAP6VPV 
CTEAVQAVET IHGQFSLAVX GCGSGLPGKN DRGLDLHGLL AFIQLQQCAQ DRCNAKUJLT 
SRALDPACME SAYPPNGVEC YSCVGI»SREA CQGTSPPWS CYNASDHVYK GCFDGNVTliT 
AANVTVSLPV RGCVQDEFCT RIX3VTGPGFT LSGSCXrQGSR QiSDLRNKTY FSPRIPPIiVR 
LPPFEPTTVA STTSVTTSTS AFVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEFRLTGGA 
AGUQDRSNSG QYPAK66PQQ PRNK6CVAPT AGI1AAZ4LIAV AASVU. 

Seq ID NO: 169 DNA sequence
Nucleic Acid Accession #: NM_006875
Coding sequence: 186-1190






1 
I 

GAATTCQGCA 
CACCAGTTTC 
CCXXIGGCGTC 
CCTCCATGTT 
CGCCAGGAGG 
GTAAGGGGGG 
CCATCAAAGT 
CATGCCCACT 
TGATCCGCCT 
CTTTGOCOGC 
CAAGCCGCTG 
TTGTCCATCG 
AACTCATTQA 
GOACAAGGGT 
CCACTGTCTG 
A6AGG6ACCA 
GCTGTGCCCT 
AGATCCTGCT 
AAAGGAGGCC 
TGGCCCCCAA 
GTTGACTTG6 
ATTGAGGATC 
CAAAG6A6CC 
CTCATTTTGC 
TTATTTTGAT 
TCATAT6CTT 
AQTAAAGGGA 
TCAGCCCAGG 

CTGGTQAGAA 
CACCACCAGA 
GCTTGCTGTT 
CTGAGCCGGG 
TCCAAGTGTG 
CCI^ATTTA 



11 

1 

CGAGCX3CGCX3 
TCTGCTTTCC 
CAOGCCCTGC 
GACCAAGCCT 
CAA6GATC6G 
CTTTGGCACC 
GATTCCCCGG 
CGAAGTCGCA 
GCTTGACTGG 
0CAG6ATCTC 
CTTCTTTGGC 
TGACATCAAG 
TTTTGGTTCT 
6TACA6CCCC 
OTCACTGGaC 
GQAGATTCT6 
AATCCGCCGG 
GGACCCCTGG 
CTGCCCCTTT 
TOGTCAQAAG 
TTTTACAGGT 
AGGGGTTAGA 
TTCCTCCCAG 
TAAGGAAGTT 
GATGTGTCAC 
TTACTTGGGC 
CCCTTTCCCC 
ATTTTTTATT 
TTTTTTTTTG 
GAACCTTAAT 
CAATAGGATG 
TGTTTTCCTG 
ATTGTCX^AT 
CCCTCCTTTT 
ATAAAAGTAA 



21 . 
I 

GCQAATCTCA 
ACCCTGGCGC 
GGGCTTAGCG 
CTACAGGGGC 
GAAGGGTTGG 
GTCTTCX3CAG 
AATCGTGTGC 
CTGCTATGQA 
TTTGAGACAC 
TTT6ACTATA 
CAA6TAGTGG 
GATGAGAACA 
GGTGCCCTGC 
CCA6AGTGGA 
ATCCTCCTCT 
GAAGCTGA6C 
TGCCTGGCCC 
ATGCAAACAC 
GGCCT6GTCC 
AGCCATCCCA 
CATTACCAOT 
AGACATAAAC 
AACCTGTGGT 
TATTTTGGTG 
CCCACATTG6 
AAGGGTGCTT 
TAGCXTTAGGG 
TT06GGGAGG 
GGTGAGGGGA 
TCCATAATTT 
GGATGGATGG 
QGGCGCTCCC 
TACTAAAATG 
TTTTCCTQCC 
TAOAATCAGA 



31 

I 

ACGCTGCGCC 
CCCCCAGCXX; 
GGTTCAGTOO 
CTCCCGCGCC 
AGGCOGAGTA 
GACACCGCCT 
TGGGCTGGTC 
AAGTGGGTGC 
AGGA2VG6CTT 
TCACA6A6AA 
CAGCCATCCA 
TCCTGATAGA 
TTCATGATGA 
TCTCTCGACA 
ATGACATG6T 
TOCACTTCCC 
CCAAACCTTC 
CAGCCGAGGA 
TT6CTACCCT 
TGGCCATGTC 
CATTAAAGTC 
CAAGTTTGCC 
CCCTGATTTT 
AAGTTGTTCC 
CACCTGCTAC 
TCCTTCCAAT 
TCCCATATTG 
TAATGCCCTG 
CCCTACTTTG 
GOOAAGGAAT 
TTTTTTGGGG 
TCCAATTTTG 
TAAATAATCA 
TOGATTATTT 
AAAAAAAAAA 



41 

I 

GTCTGCGG6C 
TG6CT0CCCA 
GCTCAATCTG 
CCCCGGGACC 
TGGACTCGGC 
CACAGATOGA 
CCCCTTGTCA 
AGGTGGTGGG 
CATGCTGGTC 
GGGCCCACTG 
6CACTGCCAT 
CCTAOGCCGT 
ACCCTACACT 
CCAGTACCAT 
GTGTGGGSAC 
AGCCCATOTC 
TTCCXX3ACCC 
TGTTACCCCT 
AAGCCT6GCC 
ACAGGGATAG 
CAGTATTACT 
CAGTTCCCTT 
GGAGGGGGAA 
CATTTTGAGC 
TACCACCACA 
ACCOCAGTAG 
GGTCAAGCTG 
TTGTTAOXC 
TTATCCCAAG 
GGAAGATGGA 
GATGG6CTAG 
CAGATTTTTG 
CGTATTGTGG 
AAAAAGCCAT 
AAAAAAAA 




CTCCAGGTGG 
GACTCAGTCA 
CACCCTGGCG 
CrCGAGCGGC 
GGTGAAGGCC 
TCCCGTGGAG 
GGCTGTGCCA 
GACTTTGATG 
GCkCTCCCGG 
ATTCCCTTTG 
TC!CCCAGACT 
TCACTGGAAO 
CAACCCCTCC 
TOGCCTGGCC 
ATGGACATTT 
AAGGTAAG6G 
CCCAATCCTA 
CTTCTTGCTT 
CCCGGGACTC 
CAAM^TTAGT 
CTTTTATTTT 
CTTACCTGCC 
AAGGCTTCTT 
TGCTCTTATT 
CACCAOOGGA 
G6GAAATAAG 
CAACCrCCTC 
GGAGGGGAGT 
GTGTGGAAAC 



Seq ID NO: 170 Protein sequence:
Protein Accession #: NP_006866



1 11 21 31 41 51

| | | | | |

HLTKPLQGPP AFP6TPTPPP GGKDREAFBA BYRLGPLLGK GGFGTVFAGH RLTDRLQVAI 60

KVIPRNRVLG WSPLSDSVTC PLEVALIiWKV GAGGGHPGVX RLLDWfETQE GFMIfVIiERPL 120

PAQDLFDYIT EK6PLGEGPS RCPFGQWAA IQHCHSRGW HRDIKDENIL IDItRRGCAKL 180

IDFGSGALLH DBPYTDPDGT RVYSPPEWIS RHQYHALPAT VWSLGILLYD MVCGDIPFER 240

DQEILEAELK PPAHVSPDCC ALZRROiAPK PSSRPSLEBI IiLDPHMQTFA EDVTPQPLQR 300

RPCPFGLVLA TIiSLAWPGIA PNGQKSBFMA MSQG



Seq ID NO: 171 DNA sequence 
Nucleic Acid Accession #: NM_003646
Coding sequence: 89..2875



1 11 21 31 41 51

| | | | | |

GCGGCGCGGA GCGGGC6TGC TGAGCCCCGG COGCOGGCCC ggcatgggoq TCTCCCGCGG 60

GCCCTCCGCC GGCCX3GGGCT AGGGCCGGAT GGAGCOG06G gacggtagcc CGGAGGCCCX3 120

GAGCAGCGAC TCCGAGTCG6 CTTCCGCCTC GTCCAGCG6C tccgIagcgcg ACQCCGGTCC 180

CGAGCCGGAC AAG6CGCCGC GGGGACTCAA CAAGCGGCGC TTCCCG6GGC TGCGGCTCTT 240

CGGGCACAG6 AAAGCCATCA CCAAGTCGGG CCTCCA6CAC CT6GCCCCCC CTCCGCCCAC 300

CCCT6GGGCC CCGTGCAGCG AGTCAQAGCG GCAGATCCGG AGTACAGTGG ACTGGAGCGA 360

GTCAGCGACA TATGG6GAGC ACATCTGGTT CGAGACCAAC GTGTCCGGGG ACTTCTGCTA 420

GGTTGGGGAG CAGTACTGTG TAGCCAGGAT GCTGAAGTCA GT6TCT0GAA GAAAGTGCGC 480

AGCCTGCAAG ATTGTGGTGC ACACGCCCTG CATCX5AGCA0 CTGGAGAAGA TAAATTTCCG 540

CT6TAAGCC6 TOCTTCOGTG AATCAGGCTC CAGGAATGTC G6G6AGCCAA CCTTTGTAOG 600

6CACCACTGG GTACACAGAC 6ACGCCAGGA GQGCAAGTGT CGGCACTQTG GGAAGGGATT 660

CCAGCAGAAG TTCACCTTCC ACAGCAA6GA GATTGTGGCC ATCAOCTGCT CGTGGTGCAA 720



60 

120 

180

240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



60 
120 

180
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 




GCAGGCATAC CACAGCAAGG TGTCCTGCTT CATGCTGCAQ CAGATCGAGQ AGCCGTGCTC 780 

GCTGGOGGTC CACOCAGCCG TGGTCATCCC GCCCACXrPGG ATCCTCCGCQ CCGGOAGQCC 840 

CCA6AATACT CTGAAAGCAA GCAAOAAGAA GAAGAGGGCA TCCTTCAAGA GGAAGTCCAG 900 

CAAGAAAGGG CCTGAGGAGG GCCX3CTGGAG ACCCTTC31TC ATCAOGCCCA CCCCCTCCCC 960

GCTCATGAAG CCCCTGCTGG TOTTTGrGAA CCCCAAGAGT GGGGGCAACC AGGGTGCAAA 1020 

OATCATCCAG TCTTTCCTCT GQTATCTOUV TOCCCGACAA GTCTTCGACXr TGAGCCAGGG 1080 

AGG6CCCAAG aASGOSCTGG AGATGTACXX3 CAAAGTGCAC AACCTGCGGA TCCTG6CGTG 1140 

CGGGQGOGAC GGCA0G6T6G GCTQQATCCT CTCCACCCTG GACCAGCTAC GCCTGAAGCC 1200 

GCCACCCCCT GTTGCCATCC TGCCCCTGGG TACTGGCAAC GACTTGGCCC GAACCCTCAA 1260 

CTGGGGTGGG GGCTACACAG ATGAGCCTGT GTCCAAGATC CTCTCCCACG TGQAGGAGG6 1320 

<3AAC3GTGGTA CAGCTGGAOC GCTGGGACCT CCACGCTGAG CCCAACCCOO AQGCAGGOCC 1380 

TGAGGACCGA GATGAAGGCG CCACCGACCG GTTGCCCCT6 GATGTCTTCA ACAACTACTT 1440 

CAGCCTCGGC TTTGACGCCC AOGTCACXXT GGAGTTCCAC QAGTCTCGAQ AGGC CAACCC 1500 

AGAGAAATTC AACAGCCGCT TTCGGAATAA GATGTTCTAC GCCGSGACAG CTTTCTCTGA 1560 

CTTCCTGATG GGCAGCTCCA AGGACCTGGC CAAGCACATC COAOTGOTGr GTGATOGAAT 1620 

GGACTTGACT CCCAAGATCC AGQACCTQAA ACCCCAGTOT GTTGTTTTCC TGAACATCCC 1680 

CAGGTACTGT GCGGGCACCA TGCCCTGGQG CCACCCTGGG GAGCACCACG ACTTTGAGCC 1740 

CCAGCGGCAT GACGACGGCT ACCTCGAGGT CATTGGCTTC ACCATGAOST CGTTGGCCGC 1800 

GCTGCAGGTG GGCGGACACG GCGAGOGGCT QACGCAQTGT OGOGAGGTGG TGCTCACCAC 1860 

ATCCAAGGCC ATCCC6GTGC AGGTGGATGG OOAGCGCTOC AA6CTTGCAG CCTCA06CAT 1920 

CCGCATC3QCC CTGOGCAACC AGGCCACCS^T GGTGCAGAAG GCCAAGCGQC GGAGOGCCGC 1980 

CCCCCTGCAC AGCGACCAGC AGCOGGTGCC AGAGCAOTTG OGCATCCAGO TQAGTOGCGT 2040 

CAGCATGCAC GACTATGAGG CCXITGCACTA OGACAAGGAa CAQCTCAAOG AGGCCTCTGT 2100 

GGGGCTG6GC ACTGTGGTGG TCCCAGGAQA CAGT6ACCTA GAGCTCTGCC GTGCCCACAT 2160 

TGAGAQACTC CAGCAGGAGC CGGATOaTOC TGGAOOCAAa TOCCOQACAT GCCAGAAACT 2220 

GTCCCXrAAO T6GTGCTTCC TGGACOCCAC CACT6CCA0C C6CTTCTACA GOATCQACOQ 2280 

AGCCCAGGAG CACCTCAACT ATGTGACTQA GATCGCACAG GATGAGATTT ATATCCTGGA 2340 

CCCTGAGCTG CTGGQGGCAT OGGOCOGGCC TGACCTCCCA ACOCCCACTT CCCCTCTCXjc 2400 

CACCTCACCC TOCTCACCCA CX5CCC0GGTC ACTGCAAGGG GATGCTGCAC CCCCTCAAGG 2460 

TOAAGAGCTG ATTGAGGCTG CCAAGAGGAA C3GACTTCTGT AAGCTCCAGG A6CTGCACG6 2520 

A6CT6GGG6C GACCTCATGC ACCOAGACGA GCA6AGT0GC AjGGCTCCTGC ACCACGCAGT 2580 

CAGCACTGGC AGCAAGGATX3 TGGTCCGCTA CCTGCTGGAC CACX3CCCCCC CAGAGATCCT 2640 

TOATGOOQTO GAGGAAAACG GGGAGACCTG TTTGCACCAA GCA6CGGCCC TGGGCCAGOG 2700 

CACCATCTGC CACTACAT06 TGGAG6CGGG GGCCTOSCTC ATGAAGACAO ACCAGCA6G6 2760 

C6ACACTC0C CGGCAGOGOG CTQAQAAGGC TCA6GACA0C 6AGCTGQC00 CCTACCTGGA 2820 

GAACOGGCAG CACTACCAOA TOATCCAGOQ GOAGGACCAG GAGAOGGCTG TGTA0G6GGC 2880 

Seq ID NO: 172 Protein sequence:
Protein Accession #: NP_003637

1 11 21 31 41 51

| | | | | |

MEPRDGSPEA RSSD5ESASA SSSGSERDAG PEPDKAPRRL KXRRFPGLRL FGHSKAITKS 60 

GIiQHIiAFPPP TPGAPCSBSB RQIR5TVDHS BSATYGEBIW FETIUVSGDFC YVGBQYCVAR 120 

MUCSVSSRKC AACKZWHTP CIEQLBKIIIF RCKPSFRESG SRNVKBPTFV RHHWVERRSQ 180 

DOKCRHCQKO PQQKFTPHSK BIVAISCSWC KOAYHSKVSC PMLQQIBBPC SU3VHAAWI 240 

PPTWILSARR PQNTLKASKK KKRASFKHKS SKKGPEEGRW RPFIIRPTP9 PLMKPLLVPV 300 

NPKSGGNQGA KIIQSPLWYL NPRQVFDLSQ GGPKEALEMY RKVHNIiRILA CGGDGTVGWI 360 

LSTLDQIiRLK PPPFVAZLPL GTGNDIiARTL NWG6GYTDEP VSKILSHVEB GNWQLDRWD 420 

LHABPNPEAG PBDRDEGATD RLPLOVFNNY FSLGPDAHVT LEFHESREAN PEKFNSRFRN 480 

KMFYAGTAFS DFLM6S5KDL AKKIRWCDG MDLTPKIQDL KPQCWFUII PRYCAGTMPW 540 

6HPGEHHDPB PQRBDDGYLE VIGFTMTSLA ALQVOCSIGER LTQCREWLT TSKAZPVQVD 600 

GSPCKZiAASR IRIALRNQAT MVQKAKRRSA API£5DQQPV PEQLRIQVSR VSMHDYEAIiH 660 

YDXBQLKBAS VPLGTWVPG DSDLBLCSAH lERLQQBPDG AGAKSPTOQK L6PKHCFIJ3A 720 

TTASRFYRZO RAQSHLNYVT EIAlC^EIYIL DPELLGASAR PDLPTPTSPL PTSPCSPTPR 780 

SmODAAPPQ OBBZiIEAAKR NDFCaOiQBXiH RAGGDLMHRD EQSRTLLHKA VSTGSKDWR 840 

yUjDBAPFBZ UDAVBENGET CLKQAAALOQ RTICHYIVBA GASLMKTDQQ GDTPRQRABK 900 
AQOTEUUmi EE9RQBYQMIQ RBDQBTAV 

Seq ID NO: 173 DNA sequence
Nucleic Acid Accession #: AF232772
Coding sequence: 1-1662

1 11 21 31 41 51

| | | | | |

ATGCCGGTGC AGCTGAOGAC AGCCCTGOGT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 60 

GTGCTGGGTG GCATCCTGGC AGCCTATGTG AOGGGCTACC AGTTCATCCA CAOGGAAAAG 120 

CACTACCTGT CCTTCX3GCCT GTA0G60GCC ATOCTGGGGC T6CACCTGCT CATTCA6AGC 180 

CTTTTTGCCT TCCTGGAGCA CCGG06CATG CGACGTGCOG GCCAGGCCCT GAA6CTGCCC 240 

TCCCOGOGGC QGQGCTCGGT GGCACTGTGC ATTGCOGCAT ACCAGGAGGA CCCTQACTAC 300 

TTGC3GCAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360 

6TG6TGGATG GCAACGGCCA GGAGGACX3CC TACATGCTGG ACATCTTCCA CGAGGTGCTG 420

GGC6GCAC0G AGCAGGCCGG CTTCTTTQTG TGGCGCAGCA ACTTCCATQA GGCAGG06AG 480 

GGTGAOAOGO A66CCAGCCT GCAGGAGGGC AT66ACG6T6 TGOGGGATGT GGTGGGGGCC 540 

AGCACCTTCT OGTGCATCAT GCAGAAGTGQ GGAGGCAAGC GCGAGGTCAT GTACAOGGCC 600 

TTCAAGGCCC TCGGOGATTC GGTGQACTAC ATCCAGGTGT GOQACTCTGA GACTGTGCTG 660 

GATCCAGCCT 6CACCAT0GA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGG6A 720 

GTGSGGGQAG ATGTCCAGAT CCTCiAACAAG TACGACTCAT GGATTTCCTT CCTSA0CA6C 780 

GTGC6GTACT GGATGGCCTT CAAGGTG6AG 06G6CCTGCC A6TCCTACTT TGGCT6T6TG 840 

CAGT6TATTA GTGGGCCCTT GGGCATGTAC C6CAACAGCC TCCTCCAGCA GTTCCTGGAG 900 

GACTGGTACC ATCAGAAGTT CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 960 

ACCAAC0GA6 TCCT6A6CCT TGGCTACOGA ACTAAGTATA CCGCGOGCTC CAAGTGCCTC 1020 

ACAGAGACCC CCACTAAGTA CCTC0GGI6G CTCAACCAOC AAACC06CT0 GA6CAAGTCT 1080 

TACTTCOGGG AGTG6CTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTQGATGACC 1140 

TAOGAGTCAG TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCACGQT TATACAGCTT 1200 

TTCTACCGGG GCCGCATCTG GAACATTCTC CTCTTCCTGC TGAOQGTQCA GCTGGTGG6C 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT CXCGGCAATG CAGAGATGAT CTTCATQTCC 1320 
CTCTACTCCC TCCTCTATATC GTCCAGCCTT CTGCCGGCCft AGA.TCTTTGC CATTGCIACC 1380

ATCAACAAAT CTGGCTG666 CACCTCTGGC CQAAAAACX31 TTGTaGTaAA CTTCATTG6C 1440 

CTCATTCCTG TGTCCATCTG GGTGGCA(3TT CTCCTG6AGG G6CTG6CCTA CACAGCTTAT 1500 

TGCCAGQACC TGTTCAGTGA GACAGAGCTA GCCTTCXOTQ .TCTCTGGGGC TATACTGTAT 1560 

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCXX TCATCGCCCX5 GCGATGTGGG 1620 

AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GACATGGCCC CCAAGCAGAG 1680 

OGGGTAAAGT GCAATGGGTA AOGGAGSGAA GGG6AATGQA AGAGAAAAGA CAGGGTGG6A 1740 

GGGAGGAGGG AGTGCTGTGT TTTAGTCTCT TAATGGTCCA AAGGACAAAT CTAAAATGCA 1800 

AAGAACX3GTG ATGTAGTATG GCCTGACAGC TCTGTTTAGA GGAGGCAACA CTGATCCCCC 1860 

AGATX3CAGGG CTGCAGGGGA TTCTGTGTTT TCAGACTGCC TGTCTGCTTG CATCTGCACA 1920 

TAG6CA6TAG CCTCCTCCTG G6CTCCAGAG GGCACTCAGA AGTTGTGCTA AACCAAGTTA 1980 

AGTCOCATTC ASTGGCAACT TGTGATA66T ACCTGA6TGA CQQCAAGCTG CGGAAGGAGG 2040 

TTCTCXX»GC CCATCTQAAC ACAACCAGA6 QTG6CA6GA6 AATTTCTACT GA60QA6GTG 2100 

GGCCGGTTAG TGTATGTCAC CCCCACCCCA CCXa^TAAGTA GTCATCAATG CAATAAQATT 2160 

GCGCGTGAGA TACAAGGCCC AGAAGCCT6A TCTTT6GGCA TCAGAAAACA GGGTCCAGGA 2220 

ATG6T6CTTT ATGTGAGATA CCCCACTCCA CATCAACATT CCAGGGATGA GCCAAACCAG 2280 

CAGGGAQTTA GCACTGAACT GCTTTTAAAA OTQCACATTA AAAAGOAAAG TTTGOCAGGA 2340 

GGAACAAAlSA GATTGTGGTG GT6CTAAAG6 AGGOCATAAG CTACAGAGAG 6CCTTGG6T6 2400 

TTCCACCTGG AAACTGCTCA GACGTCTAGA TGGGTTCTTA GCTTGTCTGT GATCTCTGCT 2460 

GGGGAGATAA AAAGATTAAG CCCCAACATG TTCAGAAAAG AAGTGAAGTC TTGGGTATTT 2520

TAACCTGTAT ACTCTTGAAT TCCTCTCAAA TTCAGCTCTG ATCTGAGGCT AAGACACACT 2580 

CCGCACTTCA CTTTCTTCAA AGCCSKATTT TTTGAGGTAT CACTGCAGTC ACCTCTTCTA 2640 

CCCTCATCAT CATAGGTAAG GTrTTCAAOG TGGGAATTG6 G6G6GAGCCC CGGCTTCTTA 2700 

TAGAAGCTTC AGCAG6AGGC AAGC6TGTTC TCAGCACATA TGGGAACTAT GAGGAGCCTC 2760 

TGATCAAATT GGCTACAATC TTGGAGCTGC TTGGACX3GAT TCCTTGGCAG CCGGGTTAGC 2820 

ATGTGTGACT TTCAGGCTAC TGTTCTTGAC AATCATCTCC AATGGAAAGC TTTTCAGTGT 2880 

TCCCAAAGTG AACTCTCAAA TCCAAAATGG TTATCTTTGA GACCATCCAT TCTCCTCAGT 2940 

GGCTTCTCCA QGGAATTCTT ACAGCCAAGT TGTGACAGTC ACTGCATTTG CCTGCTTCTT 3000 

TCCAOAAACC AAACTAGGAG ATGAAACTGG TTCCTACATC CTAAGGTTCT T6CTTTCTCT 3060 

CTCATGCCTC CTGAGGCTGT TTTTGGCTGT TTTCCCTCTG CTGCTTTTGG GGAATGAGGG 3120 

GAAGCCATTT TCCAAGTGAC TTGCAATCCA GGCTGTTCTC A6CGTTTTGA GTTTAAAACC 3180 

TG6GATCCTG ACTAAGCCTT TGACTTAAGQ GTTGCTTGCT TGCCCTCCAA ATGTCCTTTC 3240 

TCAAAGGGGC CAACTAACCX: GTGCAQAACC AGCACTAAGG TGGACASCAG ACAAGAGGGC 3300 

AA6CCTCTAA TGTACCAAGT GCTTCCTACA AAOACGCAAG GTGT6CTCCG AACXACAGAT 3360 

GGGCAAACCC TGGTGCTTTC CTTCATCTCC CAGSAACTCA AGGOTTTTCC AAGTOTAGCT 3420 

AACAGTTGCC ACATCACACA GACCTCCA6T TTCTGGTAAG ACTGCT6GTT GACATCAGAC 3480 

CXIAACCCATT GAAGGCTGQA AGGCAGCAGG CATTT6CTAA GGCAGCTGAT CCAGGCAATC 3540 

GTTCTGCTGG CCAAGAAGTT AAACTATTTT GAGCATTAGA ATGOAGGAAA TCCGGTCAGC 3600 

CAAGT6CAGA 6TTCAGACTT C6CTA2^GGGC TTGTTTTTGT TGAGCATTTA CTT6AAGATT 3660 

AATGTAGGAT GACAG6CTCT CCTGGCTGTC CTACCATCAG CTCTGCCTTQ CACTGTGGTC 3720 

GTCAACTTTC CTCAAATCAA AAACAG6CAG GTACAGGTAG TGGGCTCACA ACGTTTGACC 3780

TCGACTGGTT TTTCTAAGTT ATTTTGTACA TTTTTCAGCA GCAAAACCAA ACTGGGTCTT 3840 

CAGCTTTATC CCCGTTTCTT GCAAGGGAAG AGCCTTTATA O^TTGGACG CATTTTGGTT 3900 

TTTCCTCATT GAGAATTCAA ATCCTCTTTT GTATTGTTTC TACAATAATT TGTAAACATA 3960 

TTTATTTTTA CCTGCTTTTT ttTTTTTTTT TAATTTTCAG GTCAAGTTTT TTATACTGCA 4020 
CTTATTTGTC AAAATAAA6A TTCTCACAT 

Seq ID NO: 174 Protein sequence:
Protein Accession #: AAF36984

1 11 21 31 41 51

| | | | | |

MPVQLTTALR WGTSLPALA VLGGILAAYV TGYQPIHTEK HYLSFGLYGA ILGLHLLIQS 60 

LFAFLBHRRM RRAGQALKLP SPRRGSVALC lAAYQEDPDY LRKCLRSAQR ISFPDLKWM 120 

VVDQNRQ2DA YMLDIFBBVL GGTEQAQFFV MRSNPHEAGB GETBASLQEG MDRVRDWRA 180 

STFSCZMQKH 6GKREVMYTA FKAL6DSVDY ZQVCDSDTVL DPACTIEHLR VLEEDPC3VGG 240 

VGGDVQIINK YDSVilSFLSS VRYflMAFNVE HACQSYPGCV QCISGPLGMY RMStiLQQFLB 300 

DWYHQKPLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 360 

YPREWLYNSL WFHKHHLWMT YESWTGPFP FFLIATVIQL FYRGRIWNIL LFLLTVQLVG 420 

IIKATYACFL RGNAEMZFMS liYSUiYHSSL LPAKIFAIAT INKSGWOTSG HKTIWNFIG 480 

LIPVSIHVAV IiLEGLAYTAY OC^IiFSETEL AFLVSGAILY GCYHVAIiIiHL YLAIZARRCG 540 
KKPEQYSUVF AEV 

Seq ID NO: 175 DNA sequence
Nucleic Acid Accession #: NM_000691
Coding sequence: 43..1404

1 11 21 31 41 51

| | | | | |

CCAGGAGCCC CAGTTACOGG GAGAGGCTGT 6TCAAAGG0G CCATGAGCAA GATCAGG6AG 60 

GC06T6AAGC GOGCCC6CX3C GGCCTTCAGC TGGGGCAGGA CCCGTCOGCT GCAGTTCOSA 120 

TTCCAGCA6C T6QAG60GCT GCAOCGCCTG ATCCAG6AGC AGGAGCAGGA 6CTGGTG6GC 180 

GCGCTGGCCG CAGACCTGCA CAAGAATGAA T6GAAG6CCT ACTAT6AGGA GGTGGTGTAC 240 

GTCCTAGAGG AGATCGAGTA CATGATCCAG AAGCTCCCTG AGTGGGCCGC GGATGAGCCC 300 

GTQGAGAAGA OGCCCCAGAC TCAGCAGGAC GAGCTCTACA TCCACTCGGA GCCACTGGGC 360 

GTGGTGCICG TCATTGGCAC CTGGAACTAC CCCTTCAACC TCACCATOCA GCCCATGGTG 420 

GGGGCCATCG CTGCAGGGAA GGCAGTGGrC CTCAA6CCCT CGGAGCTGAG TGAGAACAT6 480 

GCGAGCXn^GC TGGCTACCAT CATCCCOCAG TACCT6GACA AGQATCTGTA CCCAGTAATC 540 

AATGGGGGTG TCCCTGAGAC CACGGAQCTG CTCAAGGAGA GGTTCGACCA TATCCTGTAC 600 

ACGGGCAGCA CQGGGGTGGG GAAGATCATC ATGACGGCTG CTGCCAAGC31 CCTGACCCCT 660 

GTCACGCTGG AGCTGGGAG6 GAAGA6TCCC TGCTA06TGG ACAAGAACTG TGACCTGGAC 720 

GTQGCCT6CC 0ACGCATG6C CIGGGGGAAA TTCATGAACA OiaGCCAQAC CTGCGTGGCX: 780 

CCAGACTACA TCCTCTOTGA COCCTG6ATC CAQAACCAAA TT6T0GA6AA 6CTCAASAA6 840 

TCACTGAAAG AGTTCTACGG GGAAGATGCT AAGAAATCCC GGGACTATGG AAGAATCATT 900 

AGTGCCaSGC ACTTCCAGAG G6TGAT6GGC CTGATTGAGG GCCAGAAGGT GGCTTATGGG 960 

GGCACXXK3GQ ATGCCGCCAC TCGCTACATA 6CCCCCACCA TCCTCACGGA CX3TGGACCCC 1020 
CA6TCCCCG6 TGATGCAAGA GGAGATCTTC
AGOCTGGAGO AGQCXATCCA QTTCATCAAC 
TTCTCCAQCA AOaACAAOOT OATTAAOAAO 
GCGGCCAACX5 ATGTCATCQT CCACRTCACC 
AACAGCGGCA TGGGATCCTA CCATGGCAAG 
TCTTQCCTGG TGAGGCCTCT GATGAATGAT 
CCGGCCAAOA TGACCCAGCA CTGAGSUSGG 
CCCATGGGAG T6CGC3ACCAC CCTCACTGGC 
CCCCAGCCCA GCCCCACTCC TCTGCTGAGC 
GGGCCCAGGC CTCACCATTC CAAGTCTOCA 
CAATTTTCTA ACTOGG 

Seq ID NO: 176 Protein sequence:
Protein Accession #: NP_000682

1 11 21 31 41 51

| | | | | |

MSKISBAVKR ARAAPSSGRT RPLQFRFQQIi BALQHLIQBQ EQBLV6ALAA DLHKNEWMAY 60 

YEEWYVLEE IBYMIQKLPE WAADBPVEKT PQTQQDELyi HSEPIXWVLV IGTWMY PFML 120 

TIQPMVGAIA AGNAWLKPS ELSBNMASLIi ATIIPQYIiDK DLYPVINOGV PETTBliIiKER 180 

FDHILYTGST GVGKIIMTAA AKHLTPVTLE LGGKSPCYVD KNCDXiDVACR RIAWGKFMMS 240 

GOTCVAPDYI LCDPSIQNQI VEKLKKSLKE FYGEDAKKSR DYGRIISARH FQRVMGLrBG 300 

QKVAYGGTGD AATRYIAPTI LTDVDPQSPV MQEEIFGPVL PIVCVRSLEE AIQFINQREK 360 

PLALYMFSSN DKVIKKMIAE TSSGGVAAND VIVHITLHSL PFGGVGNSGM GSYHGKKSFE 420 
TFSHSRSCLV RPLMEIDSGLK VRYPPSPAKH TQH 

Seq ID NO: 177 DNA sequence
Nucleic Acid Accession #: NM_001067.1
Coding sequence: 108-4703

1 11 21 31 41 51

| | | | | |

CTAACCGACG CGCGTCTGIG GAGAAGCGGC TT6GT0GGG6 GTGGTCTCGT GGGGTCCTGC 60 

CTGTTTAOTC GCTTTCAGGG TTCTTGAGCC CCTTCACXSAC CGTCACCATG OAAGTGTCAC 120 

CATTOCAGCC TGTAAATGAA AATAT6CAA6 TCAACAAAAT AAAGAAAAAT GAA GATGCTA 180 

AGAAAAGACT GTCTGTTGAA AGAATCTATC AAAA6AAAAC ACRATTGGAA CAT ATTTT GC 240 

TCCGCCCAGA CACCTACATT GGTTCTGTGG AATTAGTGAC CXAGCAAATG TGGGTTTACG 300 

ATGAAGATGT TG6CATTAAC TATAGGGAAG TCACTTTTGT TCCTGGTTTG TACAAAATCT 360 

TTGATGAGAT TCTAGTTAAT GCTGOGGACA ACAAACAAAG GGACCCAAAA ATGTCTTGTA 420 

TTAGAGTCAC AATTQATCOO QAAAACAATT TAATTAOTAT ATGGAATAAT QQAAAftGGTA 480 

TTCCT6TTGT TGAACACAAA GTTGAAAAGA TGTATGTCCC AGCTCTCATA TTTGGRO^GC 540 

TCCTAACTTC TAGTAACTAT GATGATGATG ARAAGAAAGT GACAGGTGGT CGAAATGGCT 600 

ATGGAGCCAA ATTGTGTAAC ATATTCAGTA CCAAATTTAC TGTGGAAACA GCCAOTAGAG 660 

AATACAA6AA AAT6TTCAAA CAGACATGGA TGGATAATAT GGGAAGAGCT G GTGAG ATGG 720 

AACTCAAGCC CTTCAATQGA GAAGATTATA CATOlATCaC CTTTCAOCCT GATTTGTCTA 780 

AGTTTAAAAT GCAAAGOCia OACAAAGATA TTOTTGCACT AATGGTCAaA A6A6CATATG 840 

ATATTGCTGG ATCCACCAAA GATGTCAAAG TCTTTCTTAA TGGAAATAAA CTGCCAGTAA 900 

AAGGATTTOS TAGTTATGTG GACATGTATT T6AAGGACAA 6TTGGATGAA ACTGGTAACT 960 

CCTTGAAA6T AATACAT6AA CAAGTAAACC ACAGGTGGGA AGTGTGTTTA ACTATGAGTG 1020 

AAAAAG6CTT TCRGCAAATT AGCTTTGTCA ACAGCATTGC TACATCCAAG GGTGGCAGAC 1080 

ATGT TG ATTA TGTAGCTCAT CAOATTGTGA CTAAACTTGT TGATGTTGTG A AGAAGAA GA 1140 

ACAAGGGTGG TGTTGCAGTA AAAGCACaTC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200 

ATGCXTTTAAT TGAAAACCCA ACCTTTGACT CTCAGACAAA AGAAAACATO ACTTTACAAC 1260 

CCSUMSAGCTT TGGATCAAC» TQCOATTGh GTGAAAAATT TATCAAA6CT GCCATTGGCT 1320 

tfl GG T A TlCT AGAAAOCATA CTAAACIGOG TQAA6TTTAA G6CCCAAGTC CAGTTAAACA 1380 

AfiAAGTGTTC AGCTGTAAAA CATAATAGAA TCAAGGGAAT TOCCAAACTC GAT6ATGCCA 1440 

ATGATGCAGO GQGCCQAAAC TCCACTGAGT 6TACGCTTAT CCTGACTGAG GGAQATTCAG 1500 

CCAAAACTTT GGCTGTTTCA GGCCTTGGTG TGGTTGGGAQ AGACAAATAT GGGGTTTTCC 1560 

CTCTTAGAGG AAAAATACTC AATGTTaSRG AAGCTTCTCA TAAGCAGATC ATGGAAAATG 1620 

CTGAGATTAA CRATATCATC AAGATTGTGS GTCTTCAOTA CAAGAAAAAC TAT6AAGATG 1680 

AAGATTCATT GAAGACGCTT CGTTATGGSA A6ATAAT6AT TATGACAGAT CAGGACCAAG 1740 

ATGGTTCCCA CATCAAAGGC TTGCTGATTA ATTTTATCCA TCACAACTGG CCCTCTCTTC 1800 

TCCGACATCG TTTTCTGGAG GAATTTATCA CTCXX31TTGT AAAGGTATCT AAAAACAAGC 1860 

AA6AAATGGC ATTTTACAGC CTTCCTGAAT TTGAAGAGTG GAAGAGTTCT ACTCX»AATC 1920 

ATAAAAAAT6 GAAAGTCAAA TATTACAAA6 GTTTGGGCAC CAGCACATCA AAGGAAGCTA 1980 

AAGAATACTT TGCAGATATG AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCTGAAG 2040 

ATGATGCTGC TATCA6CCTG QCCTTTAGCA AAAAACAGAT AOATGATGQA AAGGAATGGT 2100 

TAACTAATTT CATGGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT GAGGATTACT 2160 

TGTATGGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAG GAACTTATCT 2220 

TGTTCTCAAA TTCTGATAAC GAGAGATCTA TCCCTTCTAT GGTGGATOGT TTQAAACCAG 2280 

GTCAOAGAAA GQTTTTGTTT ACTTGCTTCA AACX3GAATGA CAAGCGAGAA GTAAAGGTTG 2340 

CCCAATTAQC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCATGGTQAG AT GTCACTA A 2400 

TGATGACCAT TATCAATTTG GCTCAGAATT TTOTOGGTAO CAATAATCTA AACCTCTTGC 2460 

AGCCCATTGG TCAGTTTGGT ACXAGGCTAC ATGQTGGCAA GGATTCTGCT AGTCCAOGAT 2520 

ACATCTTTAC AATGCTCAGC TCTTTGGCTC QATTGTTATT TCCACCAAAA 6ATGATCACA 2580 

COTTGAAGTT TTTATATGAT GACAACCAGC GTGTTGAOCC TGAATGGTAC ATTCCTATTA 2640 

TTCCCaTGGT GCT6ATAAAT GGTGCTGAAG 6AATCGGTAC TGGGTGGTOC TGCAAAATCC 2700 

CCAACTTTGA TGTGGQTGAA ATTOTAAATA ACATCAGGGG TTT6ATGGAT GQA0AA6AAC 2760 

CTTTCCCaUVT GCTTCCAAGT TACAAGAACT TCAAGGGTAC TATTGAAGAA CT6GCTCCAA 2820 

ATCAATATGT GATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAACCATT GAAATCTCAG 2880 

AGCTTCXXXrr CAQAACATGG ACCCAGACAT ACAAAGAACA AGTTCTAGAA CCCATGTTGA 2940 

ATGGCACCGA GAAQACACCT CCTCTCATAA CAGACTATAG GGAATACCAT ACA GATACC A 3000 

CTGTGAAATT TGTTOTGAAG ATGACTGAAO AAAAACTOGC AOAGGCAGAG AQAGTTG QAC 3060 

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTQACX: 3120 

ACGTAGGCTG TTTAAAGAAA TATGACACGO TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCAGACTTAA ATATTATGGA TTAAGAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240 

CTOCTAAACT GAATAATCAG 6CTC0CTTTA TCTTAGAGAA AATAGATGGC AAAATAATCA 3300 



GGGCCT6TGC T6CCCAT0GT 6TGCGTGCGC 1080 

CAGGQT6AGA AGCCCCTGGC CCTCTACATG 1140 

ATGATT6CAG A6ACATCCAG TOGTGGGGTG 1200 

TTGCACTCTC TGCCCTTCGG GGGCX3TGGGG 1260 

AAGAGCTTCG AGACTTTCTC TCACCGCOGC 1320 

GAA6GCCTGA AGGTCAGATA CCCCCCGAGC 1380 

GTTGCrCCGC CTG6CCTG6C CATACTGTGT 1440 

TCTCCTGGCC CTCSGAGAATC GCTCCTGCAG 1500 

T6CTGACCTG TOCACACCCC ACTCCCACAT 1560 

QCCCTTTCTA GACCAATAAA GAGACAAATA 1620 

TTGAAAATAA 6CCTAAGAAA 6AATTAATTA AAGTTCTGAT TCA6A6GG6A TATGATTCOG 3360 

ATCCTGTGAA G6CCTGGAAA GAAGCCXAGC AAAAGGTTCC AGAT6AAGAA GAAAATGAA6 3420 

AGAGTGACAA CGAAAA6GAA ACTGAAAAGA GTGACTCCGT AACAGATTCT GGACCAACCT 3480 

TCAACTATCT TCTTGATATG CCCCTTTGGT ATTTAACCAA GGAAAAGAAA GATGAACTCT 3540 

GCAGGCTAAG AAATGAAAAA GAACAAGAGC TGGACACATT AAAAA6AAA6 AGTCCATCAG 3600 

ATTTGTGGAA AGAAGACTTG GCTACATTTA TTGAAGAATT G6A6GCTGTT GAAGCCAAGG 3660 

AAAAACAAGIl TGAACAA8TC GGACTTCCTG G6AAAGGGG0 6AAGGCCAA0 GQGAAAAAAA 3720 

CACAAATGGC TQAAGTTTTG CCTTCTCOQC 6TGGTCAAA6 AGTCATTCCA CQAATAACCA 3780 

TAGAAATGAA AGCAGAGGCA GAAAAGAAAA ATAftAAAGAA AATTAAGAAT GAAAATACTG 3840 

AAGGAAGCCC TCAAGAAGAT GGTQTGGAAC TAGAAGGCCT AAAACAAAGA TTAGAAAAGA 3900 

AACAGAAAAQ AGAACCAGGT ACAAAGACAA AGAAACAAAC TACATTGGCA TTTAA6CCAA 3960 

TCAAAAAA6G AAAGAAOAOA AATCXCTG6C CTOATTCAGA ATCSUaATAOG AGCAGTGAGG 4020 

AAAGTAATTT TGATGTCGCT CCAG6AGAAA CAGAGCCACa GAGAQCAGCA ACAAAAACAA 4080 

AATTCACAAT GQATTTGGAT TCAGATGAAG ATTTCTCAGA TTTTGATGAA AAAACTGATG 4140 

ATGAAGATTT TGTCCCATCA GATGCTAGTC CACCTAAGAC CAAAACTTCC CCAAAACTTA 4200 

GTAACAAAGA ACTGAAACCA CAGAAAAGTG TCGTGTCAGA CCTTGAAGCT GATGATGTTA 4260 

AGG6CA6TGT ACCACTGTCT TCAAGOCCTC CTGCTACACA TTTCCCAGAT GAAACIGAAA 4320 

TTACAAACCC AGTTCCTAAA AAGAAIGTGA CAGTOAAOAA GACAGCA6CA AAAAGTCAGT 4380 

CTTOCACCTC CACTACCGGT GCCAAAAAAA GGGCTGCCCC AAAAGGAACT AAAAGG6ATC 4440 

CAGCTTTGAA TTCTGGTGTC TCTCAAAAGC CTGATCCTGC CAAAACCAAG AATCGCCGCA 4500 

AAAGGAAGCC ATCCACTTCT GATGATTCTG ACTCTAATTT TGAGAAAATT GTTTCGAAAG 4560 

CAGTCACAAG CAAGAAATCC AAGGG6GAGA GTGATGACTT CCATATG6AC TTTGACTCAG 4620 

CTGTGGCTCC TOGGGCAAAA TCTGTACGGG CAAAGAAACC TATAAAGTAC CTGGAAGAGT 4680 

CAGATGAAGA TGATCTGTTT TAAAAT6TGA GGOSATTATT TTAAGTAATT ATCTTACCAA 4740 

6CCC3WVGACT GGTTTTAAAG TTACCTGAAG CTCTTAACTT CCTCCCCTCT GAATTTAflTT 4800 

TG6GGAAGGT GTTTTTAGTA CAAGACATCA AAGTGAAGTA AAGCCCAAGT GTTCTTTAGC 4860 

TTTTTATAAT ACTGTCTAAA TAGTGACCAT CTCATGGGCA TTGTTTTCTT CTCT6CTTTG 4920 

TCTGTGTTTT QA6TCTGCTT TCTTTTGTCT TTAAAACCT6 ATTTTTAAGT TCTTCTGAAC 4980 

TQTAGAAATA GCTATCTGAT CACTTCRGCG TAAAGCA6TG TGTTTATTAA CCATCCACTA 5040 

AGCTAAAACT AGAGCAGTTT GATTTAAAAG TGTCACTCTT CeTCCTTTTC TACTrrCAGT 5100 

AGATATGAQA TAGAGCATAA TTATCT6TTT TATCTTAGTT TTATACATAA TTTACCATCA 5160 

6ATAGAACTT TATGGTTCTA GTACAGATAC TCTACTACAC TCAGCCTCTT ATGTGCCAAG 5220 

TTTTTCTTTA AGCAATGAGA AATTGCTCy^T GTTCTTCATC TTCTCAAATC ATCAGAGGCC 5280 

AAAGAAAAAC ACTTTGGCTG TGTCTATAAC TTGACACAGT CAATAGAATG AAGAAAATTA 5340 

GAGTAGTTAT GTGATXATTT CAOCTCTTGA OCTGTCOOCT CTGOCTGOCT CTGAGTCTOA 5400 

ATCTCCCAAA GAGA6AAACC AATTTCTAAG AGGACTGQAT TGCA6AA6AC TOGGGGACAA 5460 

CATTTGATCC AAGATCTTAA ATGTTATATT GATAACCATG CTCAGCAATG AGCTATTAGA 5520 

TTCATTTTGG GAAATCTCCA TAATTTC3UVT TTGTAAACTT TGTTAAGACC TGTCTACATT 5580 

6TTATATGTG TGTGACTTGA 6TAATGTTAT CAAG6TTTTT GTAAATATTT ACTATGTTTT 5640 
TCTATTAGCT AAATTCCAAC AATTTT6TAC TTTAATAAAA TG I TC TA AAC ATTGC 

Seq ID NO: 178 Protein sequence:
Protein Accession #: NP_001058.1

1 11 21 31 41 51

| | | | | |

MEVSPLQPVN ENMQVNKIKK NEDAKKRLSV ERIYQIOCrQL EHILLRPDTY IGSVELVTQQ 60 

MWVYDEEVGI NYREVTFVPG LYKIFDEILV NAADNKQRDP KMSCIRVTID PENNLISIWN 120 

KGKGIPWEH KVEKMYVPAL IFGQLLTSSK YDDDEKKVTG GRNGYGAKLC NIFSTKFTVE 180 

TASREY1QKMF KQTHNDNMQR AGBMELKPFN GEDYTCITFQ PDLSKFXMQS CDUDXVAUIV 240 

RRAYDZAGST KDVKVFUiGN KLPVRGFRSY VDNYLKDKLD ETGNSLKVIB EQVNHRNEVC 300 

LTMSEKGFQQ ISFVNSIATS KGGRHVDYVA CQIVTKLVDV VKKKNK66VA VKAHQVKNHM 360 

WIPVNALIEN PTFDSQTKEN MTIiQPKSFGS TCQLSBKPiK AAIGCGIVES ILNWVKFKAQ 420 

VQLHKKCSAV KHNRZKGIPK LDDANDAGGR NSTECTLILT BCaSSAKTLAV SGLGWGRDK 480 

YGVFPLRGKI USVBBASOgQ INENABHINZ IKIVGXiQyKK NYEDBDSLRT LRY6KIMIMT 540 

DQDQDQSHIK GLLXNFIHBN HPSLLR^RFL EEFITPIVKV SKNKQEMAFY SLPEFEEMKS 600 

STPNHKKWKV KYYKGLGTST SKEAKEYFAD MXSHRIQFKY SGFEDDAAIS LAFSKKQIDD 660 

RKEWLTNFME DRRQRKLLGL PEDYLYGQTT TYLTYNDFIN KELILFSNSD NBRSIPSMVD 720 

GLKPGQRKVL FTCFKRNDKR EVKVAQLAGS VABMSSYHHG EMSLMMTIIN LAQNFVGSNN 780 

LNLLQPIGQP GTRLHGQKDS ASPRYIPTML SSLAHLLFPP KDDBTUCFLY DDNQRVEPEW 840 

YIPIIPMVLI NGAEGIGTGN SCKIPHPDVR EIVMNZRRLM DGEEPLPMLP SYKNFKGTIE 900 

BLAPNQYVIS GEVAILNSTT lEISELPVRT WTQTYKEQVL BPMLNGTBKT PPLITDYREY 960 

HTDTTVKFW KMTBEKLAEA ERVGLHKVFK LQTSLTCNSM VLFDHVGCIiK KYDTVLDILR 1020 

DFFELRLKYY GLRKEWLLQI LGAESAKLNN QARFILEKID GKIIIENKPK KELIKVLIQR 1080 

GYDSDPVKAW KBAQQKVPDE EENEESDNEK ETBKSDSVTD SGPTFNYLLD MPLWYLTKEK 1140 

KDELCRLRNB KEQELDTLKR KSPSDLWKED LATFIEELEA VEAKEKQDBQ VGLPGKGGKA 1200 

KGKKTQMAEV LPSPRGQRVI PRZTIEMKAE ABKKNKKKZR HENTEGSPQE DGVEXiEGLKQ 1260 

RLEKKQKREP GTKTKKQTTL AFKPZKKSKK RNPHPD8E8D RSSDESNFDV PPRETBPRRA 1320 

ATKTKFTKDL DSDEDFSDFD EKTDDEDFVP SDASPPKTKT SPKLSNKELK PQKSWSDLE 1380 

ADDVKGSVPI* SSSPPATHPP DETEITNPVP KKKVTVKKTA AKSQSSTSTI GAKKRAAPR3 1440 

TKRDPALNSG VSQKPDPAKT KNRRKRKPST SDDSDSNFEK ZV8KAVT8KR SXGESDDFHM 1500 
DFOSAVAPRA KSVRAKKPZK YIiEBSDEDDL F 
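
The coding-sequence ranges quoted in these listings (for example 108-4703 for Seq ID NO: 177) appear to be 1-based and inclusive of the stop codon: in the listing above, ATG occupies positions 108-110 and TAA occupies positions 4701-4703. The following Python sketch is illustrative only and is not part of the original disclosure. It shows, under that assumption, how such a range could be used to cut the coding region out of a listing and to predict the length of the encoded protein. The function names `parse_range`, `extract_cds` and `expected_protein_length` are hypothetical.

```python
def parse_range(coding_range: str) -> tuple[int, int]:
    """Parse a 'start-end' string into two 1-based, inclusive positions."""
    start_text, end_text = coding_range.strip().split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid coding range: {coding_range!r}")
    return start, end


def extract_cds(listing_text: str, coding_range: str) -> str:
    """Return the coding region of a listing.

    Spaces, line breaks and position counters (digits) are dropped first,
    so the text of a listing can be passed in as printed.
    """
    bases = "".join(ch for ch in listing_text if ch.isalpha()).upper()
    start, end = parse_range(coding_range)
    # Convert 1-based inclusive positions to a Python slice.
    return bases[start - 1:end]


def expected_protein_length(coding_range: str) -> int:
    """Predict the residue count, assuming the range ends with a stop codon."""
    start, end = parse_range(coding_range)
    n_bases = end - start + 1
    if n_bases % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return n_bases // 3 - 1  # the final codon is assumed to be the stop


if __name__ == "__main__":
    print(expected_protein_length("108-4703"))
```

For the range 108-4703 the sketch predicts 1531 residues, which agrees with the length of Seq ID NO: 178 as listed.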



Seq ID NO: 179 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-7095

1 11 21 31 41 51

| | | | | |

CACACATAC6 CACX5CACGAT CTCaCTTOGA TCTATACACT GGAGGATTAA AACAAACAAA 60 
CAAAAAAAAC ATTTCCTTCG CTCCCXXTTCC CTCTCCACTC TGAGAAGCAO AGGAGCCGCA 120 
CGGCQAGGGG COGCAGACGG TCTGQAAATG OQAATCCTAA AGCQTTTGCT GGCTTGCATT 180 
CAGCTCCTCT GT6TTTGC0G CCTGGATTGG 0CTAAT6GAT ACTACAGACA ACAOAGAAAA 240 
CTTGTTGAAO AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 
CAAGTAAATG TOAATCTTAA GAAACTTAAA TTTCAGGGTT GG6ATAAAAC ATCATTGGAA 420 




AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCftGGGGftG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA. AGATAACTTT TOVCTGGGGA 540 

AAATGCAATA TOTCATCTOA TGQATCAGA6 C31TAOTTTA6 AAGGACAAAA ATTTCXaCTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CQATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA ASTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTQA TGGAGTC3GAA AGTGTTAGTC GTTrrGGOAA GCAGQCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTO CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TOACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTCT TTTTAAAGAT 900 

ACAQTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATQ TCATGCTGAT GGACTACTTA CAAAACAATT TTOGAGAGCA ACflG TACA AG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAQ AOATTCATGA AGCASTTTQT 1080 

A8TTCAGAAC CAGAAAftTGT TCAOGCTGAC CX3M3AGAATT ATACCAGCCT TCTTGTTACA 1140 

TG6GAAAGAC CTCGAGTOGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GA6AGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCRAGACTTG 1260 

GGTGCTATTC TCAATAATTT OCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAAT6 GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGftAC TTOATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGQAGGAG 1440 

GAASAGOGAA AAOACATTGA AGAAGGOGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCAC3UV CACACTACAA TCGCATAGGG 1560 

ACX»AATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCrCPGGA 1620 

AAGGGTOATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAOTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCXSVCC TCACACTGTG 1740 

6AA6GTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAOATC TCCACATATG 1800 

AACTTGTOGG GOACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCIOAAa ATTCTTCAGG C TCCiAG TOCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATOCXT AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GOAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AOACATAACA GCACAGCCCG ATGTT6GATC AGGCAGAGAG 2160 

AGCTTTCrCC AGACTAATTA GACTOAGATA OQTGnrTGATO AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATOTCACAG G6TC0CTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC A6QATTTGGT CTCCACQGTC AAOGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATAOUVTG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAOTCTT TCCTCTAGTC 2460 

ACCCCTTTCT TOCTTGACAA TCAGATCCTC AACACTACCC CTGCTQCTTC AAGTAflTQAT 2520 

TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTOSATG TGTCATTTQA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAQ TAQTOAATTG 2640 

TTTOGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACOSAGAfiT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCX3V 6T66CTGGGG GTQATTTGCT ATTAGA6CCC 2760 

AGCCTTGCTC AOTATTCTCA TQTGCTGTCC ACTACTCATO CT6CTTCAGA GACGCTCGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA AOSCTTATQT TTTCTCAAGT TGAACO^CCC 2880 

AGCAGIGATG CCATGATGCA TGCAOGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 

GATAATQAQG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAG06 GCCCTAGCCA TATACCAATA 3060 

CCTAAOTCPT GQTTAATAAC CGCAACTGCA TCATTACTGC AG CCTAC TCA TGCXXTTCTCT 3120 

GGTGATGGGO AATQGTCTGO AGCCTCTTCT GATAGTGAAT TTCTTTTACC TOACACAGAT 3180 

GGGCTGACaG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATQAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGA6 ATGGTTTACSC CTTCTGAAAG CACAGTCATG 3360 

CCCAftCATGT ATGATAATGT AAATAAGTT6 AATGCGTCTT TACAAOAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGG6CAT GTTTCCAGQG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCAT6AGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCX5CTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TQWMSTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 

GACACCTTOC TTAAAACTGT TCTTCC»6C?r eTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

A6TCAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTOGCC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTOTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAAT6ATGAG 4020 

TTGrrrCCAAA OQGOCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCT TATA 4140 

CATTCOSATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTT6CTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCX: 7ATAGGAAAT 4260 

GGQCATQTTG CXATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

'n-GCrgTTT C CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG 6TGATGATGA TGATGATGAC 4440 

AGAGGTA6TG AT6GCTTATC CATTCATAAG TGTATQTCAT 6CTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGG TAATGAATGA TTCAGACACC CAG6AAAACA 6TCTTATGGA TCAGAATAAT 4560 

CCAATCrCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 

TCCCAAAAGC ACAATGATOG AAAAGAGGAA AAT6ACATTC AQACTGGTAG TGCTCTGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCAT6G6CA GTTCTGACAA GTGATQAAQA AAGTGGATCA 4800 

GGGCAAGGTA CX:TCA6ATAG CCTTAATGAG AATGAQACTT OCACAGATTT CAGTTTTGCA 4860 

GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAQ GTGACTCAGA AATAACTCCT 4920 

GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAG0GA6A ACTCAGAAGT GTTCCAOGTT 4980 

TCAGAGGCAG AGGC2CAGTAA TAGTAGCCAT GAGTCTCXH'A TTGGTCTAGC TGAG GGGTTG 5040 

GAATCCGAGA AGAAOGCftOT TATACOCCTT OTGATCGTGT CAGCCCT6AC TTTTATCTGT 5100 

CTAGTGGTTC TTGTGGOTAT TCTCATCTAC TGGAGGAAAT GCTTCXAGAC TGC ACAC TTT 5160 

TACTTAOAfSG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATG ATGTCQQAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAQATTTA 5280 

CATGCAAGTA 6TGGGTTTAC TQAAQAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 

CAGAGCTGTA CT6TTQACTT AOGTATTACA 6CAGACAGCT CCAACCAOCC ASAGAACAAO 5400 

CACAA6AATC GATACATAAA TAT06TT6CC TAT6ATCATA GCA6GGTTAA GCIAGCACAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580 

TGGA6AATGA 7ATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 5640 

AAAGOAAQGA GAAAATGTGA TCAGTACTGG CCT6C0GATG 66AGTGAGGA GTACGGGAAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTQTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAA6GGCTCC CAGAAAGGAA GACCCAGTGG ACQTGTGGTC 5820 

ACACAGTATC ACTACACGCA GT6GCCT6AC AT6GGA6TAC GAGAGTACTC CCTGCCAGTG 5880 

CTGACCTTT6 TGAGAAAGGC AGCCTAOIGCC AA6C6CCAT6 CAGTGGGGCC T6TTGTC6TC 5940 

CACTGCAGTG CTC5GAGTTGG AAGAACAGOC ACaTATATTG TGCTAGACAG TATGTTGCAG 6000 

CAGATTCAAC ACQAAGOAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

A6AAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCA7A TTCATGCXTTA T6TTAATGCA 6180 

CTCCTCATTC CTSGACGAGC AG6CAAAACA AAGCTA6AGA AACAATTCCA GCTCCTGAOC 6240 

CAGTCAAATA TACAGCAOAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300 

AATCQAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTG6CATTTC ATCCCTGAGT 6360 

GQAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAGG ATTTCTGGAG GATGATATGG 6480 

GACXATAATG CCCAACTGGT GOTTATGATT GCTGAT6GCC AAAACATGGC AGAAGAIGAA 6540 

TTTGTTTACT GGCCAAATAA AGAT6A0CCT ATAAATTGT6 AGA6CTTTAA G6TCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTGAGQC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAOTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAAGAA 6780 

OCTGCCAATA GGGATGG6CC TATQATTGTT CATGATGAGC ATG6AG6AGT GACGGCAGQA 6840 

ACTTTCTGTO CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC OGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAQ 6960 

TATCAQTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACA6TAA T66TGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAOTCTTTAG TTTAACACAG AAAGGGQTGG GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGQAA AATCAGTCTA GTTCTOTTAT CTGTTGATTT CC!CATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAGG TTAG6AATTC CAAACTAGAO AAAATOTTTO TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTTG TAGCAATTAT CAGGTTTGCT AGAAATATAA ClTiTA ATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TOATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAA6 TTTTCTAGTT CTGTGTAATT 7680 

GTTTAGTTTA ATGACGTAGT TCATTAGCTG GTCTXACTCT ACCAGTTTTC TGACATT6TA 7740 

TTGTGTTACC TAAOTCATTA ACTTTCTTTC A6CA7GTAAT TTTAACTTTT GTGOAAAATA 7800 

GAAATAGCTT CATTTTGAAA GAAGTTTTTA TGAQAATAAC ACXTTTAOCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAQGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 180 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRILKRPLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNOKNWG KKYPTCNSPK 60 

QSPINIOEDL TQVNVNLKKL KFQGHDKTSL ENTFZHNTGK TJSIXILTSDY RVSGGVSBKV 120 

FKA8KITFHW GKCKK88DGS EHSLEGQKFP LEHQIYCFDA DRFSSFEBAV RQKOKLRAXiS 180 

IX<FEVGTEEN LDFKAIIDGV BSVSRFGKQA ALPPFILLNL LPNSTDKTri niGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMUOY LQNNPRBQQY KPSRQVPSSY 300 

TGKEEIHEAV CSSEPEHVQA DPENYTSLLV TWERPRWYD TMIEKFAVLY QQIiDGEDQTK 360 

HHFLTDGYQD LGAILKNIiIiP HMSYVLQZVA ICINGIiYGKY S DQLIVD KPT ONPEIiDLFPE 420 

LIGTEEZIKB BBEGKDZBBO AIVNPGRDSA TNQHUCKBPQ ISTTTBYNRZ OTKYHSAKIN 480 

RSFTRGSBFS GKQDVPKTSL NSTSQPVTKL ATBXDZSLTS QrVTELPWr VBGT8A8LND 540 

GSKTVLR5PH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFZS 600 

ENISQGYIFS SEHPETITYD VLIPESARNA SEDSTSSGSB ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTKrYTB IRVDBSEKTT XSFSAGFVMS Q6PSVTDLBM PHYSTFAYFP 720 

TEVTPRAFTP SSRQQDLVST VKWYSQTTQ PVyUGBTPKl PSYSSBVPFL VTPIiLUSNQZ 780 

UnrrPAASSS DSALRATPVF PSVDVSFESI IiSSYZX3API«L PFSSASFSSB IiFRBLHTVSQ 840 

ILPQVTSATB SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL CFGSESGVLY 900 

KTLMPSQVEP PSSDAMMHAR SSGPEPSYAL SDKEGSQHIF TVSYSSAIPV HDSVGVTYQG 960 

SLPSGPSBIP IPKSSIilTPT ASL^PTHAL SGDGBWSGAS SDSEFIiLPDT DQLTAIHISS 1020 

PVSVABFTSrr TSVFGDDHKA bSKSEZZYGN ETBLQXPSFN BHVYPSESTV MPNMroHVNK 1080 

Z17ASLQETSV SZSSTK£a4FP GSIiABTTTKV FDHEZSC3IVPE HIIFSVQFTBT VSQAS(3)TSL 1140 

RPVLSANSEP ASSDPASSEK ItSPSTOI<IiFY ETSASFSTEV ULiQPSFQASD VDTIiLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHIiIVSNSA SSENMLHSTS VPVFDVSPTS HMHSASLQGL 1260 

TZSYASEKYE FVLLKSESSH QWPSLYSND ELFQTANIiEZ NQAHPPK6RH VFATPVXiSIP 1320 

EPIdtTXiZNKL IKSDBXLT8T K8SVTGKVFA GIPTVASDTF VSTDH8VPI6 NGKVAITAVS 1380 

PHRDGSVTST KZiLFPSKATS ELSRSAKSDA 6LVGG6BIX3D TDSDGDin30D DRGSDQLSIB 1440 

KCMSCSSYRE SQBKVMNDSD THENSLMDQN NPISYSLSBN SEBDNRVTSV SSDSQTGMDR 1500 

SP6KSFSANG LSQKKNDGKE ENDIQTGSAL LPLSPESKAW AVLT8DEESG SGQGTSDSLN 1560 

EHETSTDFSF ADTNEKDADG 1LAAGD5EZT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HBSRZGLABG LBSBKKAVXP LVXVSALTFZ CLWLVGILI YHRKCFQTAH FYLEDSTSPR 1680 

VISTPPTPXF PISDDVGAIP XSMFPKHVAD tHASSGFTBE PETtKEFYOE VQSCTVDLGI 1740 

TADSSKHPDN KKKNRYINIV AYDHSRVKLA QLAEKDQKLT DYINAtTYVDG YNRPKAYIAA 1800 

QGPZtKSTAED FWRNIHEHNV EVXVMZTNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ 1860 

VLAYYTVRNP TLRNTKIKKG SQKGRPSGKV VTQYHYTQWP DMQVPEYSI.P VLTFVRKAAY 1920 

AKRHAVGPW VHCSAGVGRT GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVCJTEB 1980 

QYVFIHDTLV EAILSKETEV IiDSHXHAYVN ALLIPGPAGK TKX*EKQFQLL SQSMIQQSDY 2040 

SAALRQCNRE KNRTSSIZFV ERSRVGZSSL SGEGTDYZNA SYZMGYYQSN EFXITQHPLL 2100 

HTIKDFWRMX MDHNAQIiWM IPDGQNMAED EFVYHPNKDB PZNCESFKVT LHAEEHKCLS 2160 

NEBKLIIQDF IIiBATQDDYV liEVRHFQCPK WPNPDSPISK TFBLXSVIXS BAANRDOPHX 2220 

VHDBK6GVTA OTFCALTTLM BQZiEKBNSVD VYQVAXNXNL NRPGVFADZE QYQFLYK\aL 2280 
SLVSTRQBEN FSTSUISNaA AliPDCSIIAES LBSLV 

Seq ID NO: 181 DNA sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 148-4518

1 11 21 31 41 51

| | | | | |

C3U3U31TAOS CAGGCACGAT CTOVCTTCGfl TCTATACACT G6A6GATTAA AACAAACAAA 60
CWWAAAAAC ATTTCCTTOS CTCCCCCTCC CTCTCCACTC TGA6AAGCAG A6QAGC06CA 120
06G06A6GGG COSCAGACOG TCTGGAAATG OGAATCCTAA AGOGTTTCCT OGCTTGCATT 180
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG 6CTAATGGAT ACTACA6ACA ACAGAGAAAA 240
CTTGTTCAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300
AAATATCCAA CATGTAATA6 GCCAAAACAA TCTCCTATCA ATATTGATOA AOATCTTACA 360
CAAGTAAATG T6AATCTTAA GAAACTTAAA TTTCAG60TT GGQATAAAAC ATCATTGGAA 420
AACACATTCA TTCATAACAC TGQQAAAACA OTOGAAATTA ATCTCACTAA TGACTAC06T 480
QTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540
AAATGCAATA TQTCATCTQA TGOATCAGAG CATAGTTTAG AAGQACAAAA ATTTGCACTT 600
OAOATSCAAA TCTACTGCTT TOATO0G6AC G6ATTTTCAA GrTTTGAGOA AGCAGTCAAA 660
GSAAAAG6GA AOTTAAGAGC TTTATCCATT TTGTTTGAOG TTGGOACSUSA AGAAAATTTG 720
OATTTCAAAG CX3ATTATTGA TGGAGTCX3AA AGTGTTAGTC GTTTTGGGAA 6CAGGCTGCT 780
TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840
AATG6CTCAT TQACATCTCC TCCCTGCACA GAGACAGTTG ACTGGATTGT TTTTAAAGAT 900
ACA0TTAQCA TCTCIGAAAG 0CA6TTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960
TCTGGTTATG TCATGCTGAT 6GACTACTTA CAAAACAATT TTOGAGAGCA ACAGTACAA6 1020
TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTQT 1080
AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140
TGGGAAAGAC CT03AGT0GT TTATGATACC ATGATTGAQA AGTTTGCAlt?r TTTGTACCAQ 1200
CAGTTGGATO GAGAOGACCA AACCAAGCAT GAATTTTTGA CAGATQGCTA TCAAGACTTO 1260
G6T6CTATTC TCAATAATTT 6CTACCCAAT ATOAGTTATG TTCTTCAGAT AGTAGCCATA 1320
TGCACTAATO GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTQAT 1380
AATCXTOVAC TTGATCTTTT CCCTQAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440
GAAGAG6GAA AAGACATT6A AGAAGGC6CT ATT6TGAATC CTGGTAGAGA CAGTGCTACA 1500
AACOUUiTCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560
AOQAAATACA ATGAAGCCAA 6ACTAACCGA TCCCCAACAA GAG6AA6TGA ATTCTCTGOA 1620
AAG6GTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAOCC 1680
ACAGAAAAAG ATATTTCCTT GACTTCTCA6 ACTGTGACTG AACTGCCAOC TCACACT8TS 1740
GAAGGTACTT C3VGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAQATC TCCACATATG 1800
AACTTGTCX3G GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860
A6TTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920
GCAACTTCT6 CTATCCCATT CATCTCTQAG AACATATCOC AAGG6TATAT ATTTTCCTCC 1980
GAAAACCXy^G AGACAATAAC ATAT6ATGTC CTTATACCAO AATCTGCTAG AAATGCTTGC 2040
GAAGATTCAA CTTCATCAGO TTCAQAAGAA TCACTAAAGG ATCCTTGTAT GGAGG8AAAT 2100
GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCC06 ATOTTGGATC AGGCAGAGAG 2160
AGCTTTCrCC AGACTAATTA CACTGAGATA CGTGTtGATG AATCTGAGAA GACAACCAAG 2220
TCCTTTTCTG CAQGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280
CATTATTCTA CCTTTGCCTA CTTOGCAACT GAGGTAACAC CTCATGCTTT TACCSOCATCC 2340
TCCAGACAAC AGGATTTGGT CTCCAOGGTC AA0GTG6TAT ACTOGCAGAC AACCCAACOG 2400
GTATACAATQ CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AGCTGA6GGG 2460
TTGGAATCCG AGAAGAAGGC AGTTATACCC CTTGTGATOG TGTCAGCCCT GACPTTTATC 2520
TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGQA AAT6CTTCCA GACTGCACAC 2580
TTTTACTTAO AGGACAGTAC ATCOCCTAGA GTTATATCCA CACCTCCAAC AOCIATCTTT 2640
CCAATTTCA6 AT6ATGT0G6 A6CAATTCCA ATAAAGCACT TTCXAAAGCA TGTT6CAGAT 2700
TTACATGCAA GTAGTGGGTT TACTQAAGAA TTTGAGACAC TGAAAGAGTT TTACCAGGAA 2760
GTGCAGAGCT GTACTGTTGA CTTAGGTATT ACAGCAOACA GCTCCAACCA CCCAGACAAC 2820
AA6CACAAGA ATGGATACAT AAATATOGTT GCCTATGATC ATAGCAGG6T TAAGCTAGCA 2880
CAGCTTGCTO AAAAG6ATG0 CiAAACTGACT GATTATATCA ATOCCAATTA TGTTGATGGC 2940
TACAACAGAC CAAAAGCTTA TATTGCTOCC CAA86CCCAC TGAAATCCAC AGCTGAA6AT 3000
TTCTGGAGAA TGATATQGGA ACATAATGTG GAAGTTATTG TCAT6ATAAC AAACCTCGTG 3060
GAGAAAGGAA GGAGAAAATG TGATCAGTAC TGGCCTGCCG ATGGGAGTGA GGAGTACGGG 3120
AACTTTCTGO TCACTCAQAA GA6TGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180
ACTCIAAOAA ACACAAAAAT AAAAAA6GGC T0CCAGAAA6 GAAGACCCAG TGGAOGTGTG 3240
GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCOCTGCCA 3300
GTGCTGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTQGG GCCTGTTGTC 3360
GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420
CAGCAGATTC AAGACGAAGG AACTGTCAAC ATATTT6GCT TCTTAAAACA CATOCOTTCA 3480
CAAAOAAATT ATTT60TACA AACTOAGGAa CAATATGTCT TCATTGATOA TACACTGGTT 3540
GAGGCCATAC TTAGTAAAGA AACT6AGGTG CT66ACAGTC ATATTCATGC CTATGTTAAT 3600
GCACTCCTCA TTCCTGGACC AGCAOGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCT6 3660
AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720
AAGAATOQAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTQGCAT TTCATCCCTG 3780
AGTOOAGAAG QCACAOACTA CATCAATOCC TOCCATATCA TGGGCTATTA CCAGAGCAAT 3840
QAATTCATCA TTACCCAOCA OCCTCTCCTT CATACCATCA AGGATTTCT6 GAGGATGATA 3900
TGGGACCATA ATGCCCAACT GGTGGTTATG AITCCTGATG GCCAAAACAT GGCAGAAGAT 3960
GAATTTGTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGAGA6CTT TAAGGTCACT 4020
CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAQGACTTT 4080
ATCTTA6AAG CTACACAGGA TGATTATGTA CTTGAAQTGA GGCACTTTCA GTGTCCTAAA 4140
TO60CAAATC CAGATAGCCC CATTA6TAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200
GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGT6A066CA 4260
GGAACTTTCT GTGCTCTGAC AACCCTTATO CACCAACTAG AAAAAGAAAA TTCOSTGGAT 4320
GTTTACCAQQ TAGCCAAGAT 6ATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAO 4380
CAQTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAGGCA GGAAGAGAAT 4440
CCATCCACCT CTCTGGACAQ TAATGGTGCA GCATT6CCXG ATGGAAATAT AGCTGAGAGC 4500
TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTrrC 4560
CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTrCTGT TATCIGfTTGA TTTCOCATCA 4620
CCTGACAGTA ACTTTCATGA CATAGGATTC TGC06CCAAA TTTATATCAT TAACAATGTG 4680
TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740
CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TTGATTTTAA CAGAAAATTT 4800
CAATTTATAG AGGTTAG6AA TTCCAAACTA CAGAAAATGT TTGTTTTTAO TGTCAAATTT 4860
TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAOAAATA TAACTTTTAA TACAGTAGCX: 4920
TGTAAATAAA ACACTCTTCC ATAT6ATATT CAACATTTTA CAACTGCAST ATTCACCTAA 4980
AOTAGAAATA ATCTOTTACT TATTOTAAAT ACTGCCCTAG leTCTCCATG GACCAAATTT 5040
ATATTTAIKA TTBTAGATTT TTATAXTTTA CTACTOAOTC AASITTTCIA 6TTCIBTOTA 5100
ATTBTTTAGT TTAATGAOST ASTTCATTA6 CTG6TCTTAC TCIACCA6TT TTCTGAaTT 5160
QTATTGTQTT ACCTAAGTCA TTARCTTTGT TXCAGCATGT AATTTTAACT TTTGTGGAAA 5220
ATAQAAATAC CTTCATTTTC AAAGAAGTTT TTAIGAGAAT AACACCTTAC CAAACAITGT 5280
TCAAATQGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340
AAAAAAASAA AAAAAAAAAA AAAAAAA



Seq ID NO: 182 Protein sequence:
Protein Accession #: Eos sequence 



1 
I 

MRILKRFLAC 
QSPINIDEDL 
FKASKITFHW 
ILFBV6TEBN 
TDTVDWIVPK 
TGKEEIHEAV 
KEFLTDGYQD 
LI6TEEIIKB 



II 



IQIiLCVCRLD 
TQVNVNIiKKL 



6SKTVLRSPH 
BNZ8QGYIFS 
TAQP0VGS6R 
TEVTPHAPTP 
PLVIVSALTF 
PIKHPPKHVA 
VAYDBSRVKL 
VEVIVMITEJL 



TGTYIVLDSM 
VLDSHZEAYV 
VERSRVGISS 
MIPDGQNNAE 
VLEVRHFQCP 
MHQLEKENSV 
AALPD6NIAE 



LDFKAIIDGV 
DTVSISESQL 
CSSEPENVQA 
LGAIIiNNLLP 
EBEGKDIEEG 
GKGDVFNTSL 
MNLSGTAESL 
SENPETITYD 
ESFLQTNYTE 
SSRQQDLVST 
IdiWLVGIL 
DZiHASSGFTE 
AQLAEXDOKL 
VBKGRRKCDQ 
WTQYHYTQW 
LQQIQHEOTV 
NALLIPGPAG 
LSGEGTDYm 
DEFVYHFNKD 
KWPNP0SPI8 
DVYQVAKMIN 
SLESLV 



21 
I 

WANGYYRQQR 
KFQGHDKTSL 
EHSLEQQXFP 
ESVSRFGKQA 
AVFCEVLTMQ 
DPENYTSLLV 
NMSYVLQIVA 
AIWPGRDSA 
NSTSQPVTKL 
NTVSITEYEE 
VLIPESARNA 
IRVDESEKTT 
VNWYSQTTQ 
lyWRKCFQTA 
BFETLKEFYQ 
IDYIKANYVD 
YWPAOGSEBY 
PDMGVPEYSL 
NIFGFLKHIR 
KTKLBKQFQL 

ASYiMGnnrQs 

EPIKCESFKV 
KIFELZSVIK 
LMRPGVFADI 



31 



KIiVEEIGWSY 
ENTFZHNTGK 
LBKQIYCFDA 
AIiDFPIIiUIL 
QSGYVMLMDY 
TWERPRWYD 
ICTNGLYGKY 
TNQIRKKEPQ 
ATSKDISLTS 
ESLLTSFKLD 



41 

I 

TGALNQKNWG 
TVBINI.TNDY 
DRFSSFEEAV 
LPNSTDKmri 
LQMHFREQQY 
TMIEKFAVLY 
SDQLIVDMOT 
ISTTTHYNRI 
QTVTBLPFBT 



51 



KKYPTCHSPK 
RVSGGVSEMV 



KSFSAGPVMS 
FVYNABASNS 
HPYLEDSTSP 
EVQSCTVDLG 
GYNRPKAYIA 
GNPLVTQKSV 
PVLTFVRKAA 
SQRNYLVQTE 
IiSQSNIQQSD 
MBFinOHPL 
TLMAKEHKCL 
BBAA11XU30PM 
EQYQPLYKVI 



ESLKDPSHBG 
QGPSVTDLEM 
SHESRIGIiAE 
RVISTPPTPI 
ITADSSNHPD 
AQGPIiKSTAE 
QVLAYYTVRN 
YAKRHAVGPV 
EQYVFIHDTL 
YSAALKQCNR 
LBTIKDFHRN 
SNEBKIiIZQD 
rVHDEHQGVT 
IiSLVSTRQES 



YKGSI>T9FPC 
KFSRQVFSSY 
QQIjDGEDQTK 
DNPELDLFPE 
GTlCmEAKZN 
VEGTSASLND 
PATSAZPFIS 
NVWFPSSTDI 
PHYSTFAYFP 
GLESEKKAVI 
PPISDDVGAI 
NKHKNRYINI 
DFWRMIWEHN 
FTLHNTKIKK 
WHCSAGVGR 
VEAILSKETE 
EKNRTSSZZP 
INDHNAQLW 

fzz^i:qddy 

AGTFCALTTL 



Seq ID NO: 183 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4494



1 
I 

CACACATACX3 
CAAAAAAAAC 
CGGCGAGGGG 
CAGCTCCTCT 
CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATQ 
AACACATTCA 
6TCAG0GGAG 
AAATGCAATA 
GftGATGGAAA 
GGAAAAGG6A 
GATTTCAAAG 
TTA6ATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCIGOTTATG 
TTCTCTAGAC 
AGTTCA6AAC 
TGGGAAAGAC 
CAGTTOGATG 
GGTGCTATTC 
TGCACTAATG 
AATCCT6AAC 
6AAGAGGGAA 
AACCAAATCA 
ACGAAATACA 
AAGGGTGATG 
ACA6AAAAAG 
6AAG6TACTT 
AACTTGTCGG 
AGTTTATTGA 
GCAACTTCT6 
SAAAACCCAO 
6AAGATTCAA 
GTGTGGTTTC 
AGCTTTCTCC 
TCCTTTTCTG 
C3VTTATTCTA 
TCCAGACAAC 
GTATACAATG 



11 

I 

CACGCACGAT 
ATTTCCTTCG 
CCGCAGACCG 
GTGTTTGCCG 
AGATTGGCTG 
CAT6TAATAG 
TGAATCTTAA 
TTCATAACAC 
GAGTTTCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
CGATTATTGA 
TCATACTGTT 
TGACATCTCC 
TCTCT6AAAG 
TCATGCTOAT 
AGGTGTTTTC 
CA6AAAATGT 
CTCGAGTCGT 
GAGAGGAOCA 
TCAATAATTT 
GCTTATATGG 
TTQATCTTTT 
AAGACATT6A 
GGAAAAA6GA 
ATGAA6CGAA 
TTCCCAATAC 
ATATTTCCTT 
CA6CCTCTTT 
GGACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTA6CTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
AG6CCAQTAA 



21 

I 

CTCACTTOGA 
CTCCCCCTCC 
TCTGQAAATG 
CCTGGATTGG 
GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TGGGAAAACA 
AATGGTGTTT 
TGGATCAGAG 
TGATGCAGAC 
TTTATCCATT 
TGGAGTOGAA 
GAACCTTCTG 
TCCCTGCACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 
AACCAAGCAT 
GCTACCXAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAGGCGCT 
ACCCCAGATT 
GACTAACCX3A 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCrCTGAG 
ATATGAT6TC 
TTCAGAA6AA 
AGACATAACA 
CACTGAGATA 
GATGTCACAG 
CTTCCCAACT 
CTCCACGGTC 
TAGTAGGCAX 



31 
I 

TCTATACACT 
CTCTCCACTC 
CGAATCCTAA 
GCTAATGOAT 
GQAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTGGAAATTA 
AAAGCAAGCA 
CATi^TTTAG 
CGATTTTCAA 
rTGTTTQAGG 
A6T6TTAGTC 
CCAAACTCAA 
GACACAGTTG 
GTTTTTTGTQ 
CAAAACAATT 
GGAAAGGAAG 
CCAGAGAATT 
ATGATTGAGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 
ATTGTGAATC 
TCTACCACAA 
TCCCCAACAA 
TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCX: 
CTTATACCAG 

tcactaaagg 
gcacagccx:g 

CX3TGTTGATG 
GQTCCCTCAG 
GAGGTAACAC 
AAGGTGGTAT 
GAGTCT08TA 



41 
I 

6QAGGATTAA 
TGAGAA6CAG 
AG06TTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGAT6A 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 
GTTTTGAGGA 
TTGGGACAGA 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGATTGT 
AAGTTCTTAC 
TTCQAQAGCA 
AGATTCATGA 
ATACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 
TTCTTCAGAT 
TTGTOGACAT 
AAGAAATAAT 
CTGGTA6AGA 
CACACTACAA 
GAGGAAGTGA 
AACXAGTCAC 
AACTGCCACC 
TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTGCTAG 
ATCCTTCTAT 
ATGTTGGATC 
AATCTGAGAA 
TTACAQATCr 
CTCAXGCTTT 
ACTGGCAGAC 
TT6GTCTAGC 



51 
I 

AACAAACAAA 
AGGAGCCGCA 
C6CTTGCATT 



TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTACOGT 
TCACTGGQQA 
AT7TCCACTT 
AGCAGTC3U\A 
AGAAAATTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 
ACAGTACAAG 
AGCAGTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TCAAGACTTG 
AGTAGOCATA 
6CCTACTGAT 
CAAGGAGGA6 
CAQTGCTACA 
TGGCATAGGG 
ATTCTCrGUA 
TAAA7TAGCC 
TCACACTGTG 
TCCACATATG 
TGAGGAGGAG 
CTCCAGTCCC 
ATTTTCCTCC 
AAATGCTTCC 
GGAGGGAAAT 
AGGCAGAGAG 
GACAACCAAG 
GGAAATGCCA 
TACCCCATOC 
AAOCCAACOG 
T6AGG6GTTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 









WO 02/086443 

QAATCG6AGA AGAAGGCAGT TATACX!CCTT C5TGATCGTGT 
CTAGTGGTTC TT6IGGGTAT TCTCATCTAC TGGAGGAAAT 
TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC 
ATTTCAGATG ATGTC3GGAGC AATTCCAATA AAGCACTTTC 
CAT6CAAGTA GTGGGTTTAC TGAAGAATTT GAGGAAGTOC 
GGTATTACA6 CAGACAjGCTC CAACXACCCA GACAACAAGC 
ATOGTTGCCT ATGATCATAG GAOGGTTAAG CTA6CACAGC 
CTGACT6ATT ATATCAATGC C3^TTATGTT GATGQCTACA 
GCTGCCCAAG GCCCACTQAA ATCCACAGCT GAAGATTTCT 
AATGT6GAA0 TTATrGTCAT GATAACAAAC CTCGTGGAGA 

cagtactggc ctgcogatgg gagtgaggag taosggaact 
gtgcaagtgc ttgcctatta tactgtgagg aattttactc 
aaggoctccc agaaaggaag acccagtgga cgtgtggtca 
tggcctgaca tggqa6tacc agagtactcc ctgccagtgc 

GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC 
AGAACAGGCA GATATATTCT 6CTAGACAGT ATGTTGCAGC 
GTCAACATAT TTGGCTTCTT AAAACACATC C6TTCACAAA 
GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTQAGG 
GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAAT6CAC 
GGCAAAACAA AGCTA6AQAA ACAATTCCAG CtGCtGAOCC 
GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAQA 
CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 
AATGCCTCCT ATATCAtOGG CTATTACCAG AGCAATGAAT 
CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 
GTTATGATTC CTGATGGCCA AAACATGGCA GAAGAT6AAT 
GATQAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 
CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 
TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 
AGTAAAACTT TT6AACTTAT AA6TGTTATA AAAGAAGAA6 
ATGATTGTTC ATGATGAGCA TGOAGQASTO AOaGC aQaAA 
CTTATGCACC AACTAGAAAA AGAAAATTGC GTCGAXGrTT 
AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGA6CA6T 
ATCCTCAGCC TTGTGA6CAC AAGGCAGGAA GAGAATCCAT 
GGTOCAGCAT TGGCTGATQG AAATATAOCT GAGAGCTTAG 
AAG6GQTG6G GGGACTCACA TCTOAjSCATT tfnTl ' CCiX.T 
ATCA6TCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 
GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 
TACTTATTAT GTTTQAACTA AAATGATTGA ATTTTACAGT 
GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 
AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 
AGGTTTGCTA 6AAATATAAC TTTTAATACA GTAGCCTGTA 
GATArrCAAC ATTTTACAAC TGCAGTATTC ACCTAAAflTA 
GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT 
ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG 
CATTAGCTGG TCTTACTCTA CCAOTTTTCT GACATTOTAT 
C7TTQTTTCA GCATGTAATT TTAACTTTTO TGGAAAATAG 
AAGTTTTTAT GAGAATAACA CCTTAOCAAA CATTGTTCAA 
T6CAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 
AAA 

Seq ID NO: 184 Protein sequence:
Protein Accession #: EOS sequence



PCT/US02/12476 



1 
I 

MRIIiKRFIiAC 
Q8PINIDBDL 
FKASKITFHW 
ILFBVGTBKH 
TDTVDWIVFK 
TGKEEIHEAV 
HBFLTDGYQD 
LIGTEEIIKE 
RSPTRGSEFS 
GSKTVIiRSPH 
EHISQGYIFS 
TAQPDVGSGR 
TEVTPHAFTP 
LVIVSALTFI 
IKHFPKHVAD 
KLAQLAEKDG 
NLVGKGRRKC 
GRWTQYHYT 
8ML0QIQHEG 
YVNALLIPGP 
SSL5GEGTDY 
AEDSFVYHPII 
CPRHPNFDSF 
SVDVYCJVAKM 
AE8LESLV 



11 
I 

IQIiLCVCRLD 
TQVNVNUCKL 
GKCNHSSDGS 
LDFKAIIDGV 
DTVSISBSQL 
CSSEPSIVQA 
LGAILNNLLP 



31 



NANOyyRQQR 
KPQGWDKTSL 



GKBDVP23TSL 
HNLS6TABSL 
SENPBTITYD 
BSFLQTNYTB 
SSRQQDLVST 
CLWLVGILI 
LHASSGPTEE 
KLTOYINANY 
DQYHPADGSB 
QWPDMGVPEY 
TVNIPGFLKH 
AGKTKLEKQF 
INASYIMGYY 



XSKrFBIiZSV 
ZNLHRPGVFA 



ESV5RFGKQA 
AVPCEVLTMO 
DPBNYTSLLV 
NMSYVLQIVA 
AZVNPGSDSA 
MSTSQPVTKL 
NTVSITBYEE 
VIiIPESARKA 
IRVDESBKTT 
VNWYSQTTQ 
YWRKCFQTAH 
FEBVQSCTVD 
VDGYNRPKAY 
EYGNPLVTQK 
SLPVLTFVRK 
IRSQRNYLVQ 
QLLSQS17IQQ 
QSNEFIITQB 
KVTLMABEHK 
IKEBAANRDG 
DIBQYQFLYK 



KLVEBZGHSY 
ENTFZKNTGK 
LEMQZYCFDA 
ALOPFZLLNL 
QSGYVHLKDY 
TWBRPRWYD 
ICTNGLYGKY 
TNQZRKKEPQ 
ATEXDZSLTS 
ESLLTSPKLD 
SEDSTSSGSE 
RSFSAOPVMS 
PVYNBASHSS 
FYLEDSTSPR 
LOZTADSSNH 
ZAAQGPLKST 
SVQVLAYYTV 
AAYAKRHAVG 
TBEQYVFZHD 
SDYSAALKQC 
PLLHTZKDFW 
CLSNHEKIiZI 
PMZVHDEHGG 
VZliSLVSTRQ 



CAGCCCTGAC 
GCTTCCAGAC 
CTCCAACACC 
CAAAGCATGT 
AGAGCTGTAC 
ACAAGAATCQ 
TTGCTGAAAA 
ACAGACCAAA 
GGAGAATXSAT 
AAGGAAOGAG 
TTCTGGTCAC 
TAAGAAACAC 
CACAGTATCA 
TOACCTTTGT 
ACTGCAGTGC 
AGATTCAACA 
GAAATTATTT 
CCATACTTAG 
TCCTCATTCC 
AOTCAAATAT 
ATOQAACTTC 
GAGAAGGCAC 
TCATCATTAC 
ACCATAATGC 
TIOTTTACTG 
T66CTGAA6A 
TAGAAGCTAC 
CAAATCCAGA 
CIGCCAATAG 
CTTTCT6TQC 
ACCAGGTAGC 
ATCAGTTTCT 
CCACCTCTCT 
AGTCTTTAGT 
TOCTAAAATT 
ACAGTAACTT 
TTTTTGCAAG 
ATTTCTAAGA 
TTATAGAGGT 
CTGTATTTGT 
AATAAAACAC 
GAAATAATCT 
TTATAATTGT 
TTTAGTTTAA 
TGTGTTACCT 
AAATACCTTC 
ATGGTTTTTA 
AAAAAAAAAA 



41 
I 

TGAIiNQKNWG 
TVEZNLTHDY 
DRFSSFEEAV 
LPHSTDKYYI 
LQNNFREQQY 
TKIEKFAVLY 
SDQI»XVDMPT 
ZSTTTHYHRZ 
QTVTELPPHT 
TGAEDSSGSS 
ESLKDFSMEG 
QGPSVTDLEM 
RBSRIGIiASQ 
VZSTPPTPZF 
PDNKHXKRYI 
AEDFWRMZWB 
RNFniRNTKZ 
FWVKSAGV 
TLVEAZLSKB 



RMZWDHNAQL 
QOFZLEATQD 
VTAGTFCALT 
BSHPSTSLDS 



TTTTATCTGT 
TGCACACTTT 
TATCTTTCCA 
TGCAGATTTA 
TGTTGACTTA 
ATACATAAAT 
GGATGGCAAA 
AGCTTATATT 
ATY3GGAACAT 
AAAA1GTGAT 
TCAGAAGAGT 
AAAAATAAAA 
CTACAOGCAG 
GA6AAAGGCA 
TG6AGTTGGA 
CGAAOGAACT 
GGTACAAACT 
TAAAGAAACT 
TGGACCA6CA 
AGAGCA6AGT 
TTCTATCATC 
AGACTACATC 
CCAGCACCCT 
CCAACTQGTG 
GCX3UUITAAA 
ACACAAATGT 
ACAGGATGAT 
TAGCCCX3VTT 
GGATGGGCCT 
TCTOACAACC 
CAASATGATC 
CTACAAAGTG 
GGACAGTAAT 
TTAACACAGA 
AG6CAGQAAA 
TCATGACATA 
ACTTGTAATT 
ATGGAATTGT 
TAGGAATTCC 
AGCAATTATC 
TCTTCCATAT 
OTTAC^ATT 
AGATTTTTAT 
TGACGTAGTT 
AAOTCATTAA 
ATTTTGAAAG 
TCCAAGGAAT 
AAAAAAAAAA 



51 

I 

KKYPTCNSPK 
RVSGGVSBHV 
KGKGKtiRALS 
YN6SLTSPPC 
KPSRQVPSSY 
QQLDGEDQTK 
DNFELDIiFPB 
STKYMEAKIN 
VEGTSASLND 
PATSAZPFZS 
NVHFPSSTDZ 
PHYSTFAYFP 
I£SBKKAVIP 
PZSDDVGAZP 
NIVAYDHSRV 
HNVBVZVMZT 
KKGSQKGRPS 
GRTGTYZVLD 
TBVUDSBZRA 
ZPVERSRVGZ 
WMIPDQQNM 
DYVLEVRHFQ 



2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 



MGAALPDGKl 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 






Seq ID NO: 185 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 501-4514



11 



21 



31 



41 




CACACATAOO CACGCHOGAT CTCACTTGSA TCTATACACT GORGQATTAA AACAAACARA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCXaCTC TSA QAAGC AG AGGAGCCXK3^ 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG C3GAATCCTAA AGOGTTTCCT CGCTTGCATT 180 

CAOCTCCTCT GTQTTTGCCG CCTGGATTGG GCTAATGOAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 300 

AATATCOiAC ATGTAATAQC CCAAAACAAT CTCCTATCAA TATTGAT6AA GATCTTACAC 360

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGCTTG QSATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCA CTAA T GACTACCGTO 480 

TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGG6AA 540 

AATGCAATAT GTCATCTGAT GGATCAOAGC ATAGTTTAQA AGGACAAAAA TTTCCACTTG 600 

AGATQCAAAT CTACTGCTTT QATG08GACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGQGAA GTTAAOAGCT TTATCCATTT T6TTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAA6C GATTATTGAT GGAGTOSAAA GTGTTAGTCG TTTTGGGAAG CAGGCT6CTT 780 

TAGATCCATT CATACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACRTTTACA 840

ATGGCTCATT GACATCTCCT CCCT6CACAO ACACAGTTGA CTGQATTGTT TTTAAAGATA 900 

CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGT6A AGTTqTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCQAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTT QTA 1080 

GTTCAQAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACC3U3CCTT CTTGTTACAT 1140 

GGGAAAGACC TCGAGTCX3TT TATGATACCA TGATTGAOAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATG6 AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAQACTTGG 1260 

OTGCTATTCT CSATAATTTG CTACCCAATA- TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACCAACTGAT TGTCGACATQ CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACIGA AGAAATAATC AAGGAGGAGG 1440 

AAGAG6GAAA AGACATTGAA GAAGGCX5CTA TTGTGAATCC TG6TAGAGAC AGTGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTQAA TTCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CXZACTTCCCA ACCAGTCACT AAATTAGOCA 1680 

CAGAAAAAGA TATTTCCTTG ACTTCTCAQA CTOTGACTQA ACTGCCACCT CACACTGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACIX3CAGAA TCCTTAAATA CAGTTTCTAT AACA6AATAT GAGGAGGAGA 1860 

GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCA6TCCC0 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATOCCA AGGGTATATA TTTTCCTCOO 1980 

AAAACOCAGA GACAATAACA TATOATGTCC TTATACC3W3A ATCTGCXAGA AATGCTTC03 2040 

AAGATTCAAC TTCATCAGOT TCAQAAGAAT CACTAAAGGA TCCTTCTATO GAGG6AAATC 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCOA TGTTGGATCA GGCAGAGAGA 2160 

GCTTTCTCCA QACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC AQGCCCAGTG ATGTCACAGG GTCCCTCAGT TACAG ATCTG GAAATGCCAC 2280 

ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACAOC TCATGCTTTT ACCCCATCCT 2340 

CCAQAC3VACA GQATTTGGTC TCXACGGTCA AOGTGGTATA CT06CAGACA ACX!CAACOQG 2400 

TATACAAT6A GGCCAGTAAT AGTAGCCATG AGTCT06TAT TGQTCTAGCT GAGGGGTTGG 2460 

AATCCGAGAA GAAGGCAGTT ATACXXICTTG TGATCGTGTC AGCCCTGACT TTTAT CTGTC 2520 

TAGTGGTTCT T6TGGGTATT CTCATCTACT GGAGGAAATO CTTCCA6ACT GCACACTTTT 2580 

ACTTAOAOGA CAGTACATC3C CCTAGAGTTA TATCXAOWC TCCAACACCT ATCTTTCCAA 2640 

TTTCAGATGA TOTCQOAQCa ATTCCAATAA AGCStfTPTTCC AAAGCATGTT GCAGATTTAC 2700 

ATGCAAGTAO TCGGTTTACT GAAGAATTTG AGACACTGAA A6AGTTTTAC CAGGAAGTGC 2760 

AGAGCTGTAC TGTTGACTTA GGTATTACAQ CAGACAGCTC CAACCACCCA GACAACAAQC 2820 

ACAAGAATC36 ATACAKAAAT ATC3GTTGCCT ATGATCATAQ CAGGGTTAAG CTAGCACAGC 2880 

TTOCTGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGOCCaAG GOCCACTGAA ATCCACAGCT GAAGATTTCT 3000 

GGAGAAXGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 

AA0GAAG6AG AAAATGTGAT CAGTACTG6C CTGCCGATGG GAGTGAGGAG TAC0G6AACT 3120 

TTCTQGTCAC TCAGAAGAGT GTGCAAGTQC TTGCCTATTA TACTGTQAG6 AATTTTACTC 3180 

TAAOAAACAC AAAAATARAA AAGGGCTCOC AGAAAGGAAG ACCCAGTGGA C6TGTGGTCA 3240 

CACAGTATC3V CTACACOCAG TGGCXTGACA T6GGAGTACC AGAGTACTCC CTGCCAGTGC 3300 

TGACCTTTGT GAGAAAGGCA 6CCTATGCCA AGOGCCATGC AQTGGGGCCr G TTGTCQ TCC 3360 

ACTGCAGTGC TGGAGTTGGA AGAACAGGCA CATATATTGT GCTAGACRGT ATGTT GCAGC 3420 

AGATTCAACA CXSAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC aSTTCACAAA 3480 

GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 

CC31TACTTA6 TAAAGAAAGT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCTCATTCC TGGACCAGCA GGCAAAACSUi AGCXAGAGAA ACAATTCCAG CTCCTGAGCC 3660 

AGTCAAATAT ACAGCAGAGT GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAOA 3720 

ATCQAACTTC TTCTATCATC CCTGTGGAAA 6ATCAAGGGT TGQCATTTCA TCCCTGAGTG 3780 

GA6AA6GCAC AGACTACATC AATGCCTCCT ATATCATGGQ CTATTACCAG AGCAATGAAT 3840 

TCATCATTAC CCA6CACCCT CTCCTTCATA CCATCAAQGA TTTCTGGAGG ATOATATGGG 3900 

ACCATAATGC CCAACTGGTG 6TTATGATTC CT0ATGGCC31 AAA CATGG CA GAAGATGAAT 3960 

TTGTTTACTO 6CCAAATAAA GATGA60CTA TAAATTGTGA 6AGCTTTAA0 GTGACTCTTA 4020 

TGGCTGAAGA ACACAAATGT CTATCTAATO AGGAAAAACT TATAATTCAO GACTTTATCT 4080 

TAGAAGCTAC ACAG6ATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 

CTGCCAATAG GQATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA 4260 

CTTTCTGTCC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 

ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAQT CTTTGCTGAC ATT6A6CAGT 4380 

ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGT6AGCAC AAGGCAGGAA GAGAATCCAT 4440 

CCACCTCTCT GGACAGTAAT GGTGCAGCAT TGCCT6ATGG AAATATAGCT GAGAGCTTAG 4500 

AGTCTTTAGT TTAACACAGA AAGGGGTGG6 GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AG6CAGGAAA ATCAGTCTAG TTCT6TTATC TGTTGATTTC CCATCACCTG 4620 

ACAOTAACTT TCATGACATA GGATTCTGCC GCCaWVATTTA TATCATTAAC AATGTGTGCC 4680 

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AA ATTTCAA T 4800 

TTATAQAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTCTC AAATTTTTAG 4860 

CTGTATrTGT AGCRATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 4980

GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTGTC TCXATGGACC AAATTTATAT 5040 

TTAT/VATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAiGT TTTC TASIT C TGTGTftATIG 5100 

TTTAGTTTAA TGACJGTAGTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160
TGTGTT A CCT AA6TCATTAA CTTTGTTTCA GCftlGTAATT TTAACTTTTG TGGAAAATAS 5220
AAATACCTTC ATTTTGAAA6 AAOTTTTTAT GAOAATAACA GCTTAGCAAA CATTGTTCAA 5280
ATGGTTTTTA TCCAA66AAT T6CAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340
AAAAAAAAAA AAAAAAAAAA AAA

Seq ID NO: 186 Protein sequence:
Protein Accession #: EOS sequence






1 
I 

MVFKASKITF 
LSILFEVGTE 
PCTDTVDWIV 
SYTGKEBIHE 
TKHEPLTDGY 
PELIGTBEII 
TNRSPTRGSE 
NDGSKTVLRS 
ISEMISQGYI 
DITAQPDVGS 
FPTEVTPHAP 
IPLVIVSALT 
IPIKHFPKHV 
IVAYDHSRVK 
NVEVIVMITN 
KGSQKGRPSG 
RTGTYIVLDS 
EVLDSHI3AY 
PVERSRV6IS 
VMIPDGQNMA 
YVLEVRHFQC 
IMHQZiEKSIS 
GAALPDGRIA 



11 
I 

HW6KCKMSSD 
EHLDFKAIID 
FKDTVSISES 
AVCSSEPENV 
ODLGAZLNHIi 



PSGKGDVPNT 
PHMNLS6TAE 
PSSiENPETIT 
GRESFLQTNY 
TPSSRQQDLV 
FICLWLVGI 
ADLHASSGFT 
LAQLAEKDGK 
LVEKGRRKCD 
RWTQYHYTQ 
MLQQIQHBGT 
VMMiblPGPA 
SLSOEGTDYI 
EDBFVYWPNK 
PKWPNPDSPI 
VDVYQVAKMI 
ESLESLV 



21 
I 

GSBHSIiEGQK 
GVBSVSRFQK 
QLAVPCBVLT 
QADPENYTSL 
LPNMSYVLQI 
EGAIYNPORD 
SIiNSTSQPVT 
SLNTVSITEY 
YDVLIPESAR 
TEIRVDBSEK 
STVNWYSQT 
LIYWRKCPOT 
EEFETLXEFY 
LTDYINANYV 
QYWPADGSEB 
WPDMGVPBYS 
VNIFGFLKHI 
GKTKLEHQFQ 
NASYIMGYYQ 
DEPIHCESFK 
SKTFBIiISVI 
MIiMRPGVFAD 



31 

1 

FPLEMQIYCF 
QAALDPPILL 
MQQSGYVMLM 
IiVTWERPRW 
VAICTNGLYG 
SATNQIRKKE 
KLATEKDISXi 
EKESLLTSFK 
NASEDSTSSG 
TTKSPSAGPV 
TQPVYNBASN 
AHFYLEDSTS 
QBVQSCTVDIi 
DGYNRPKAYI 
YQTPLVTQKS 
LPVLTPVRKA 
RSQRMYI.VQT 
J4I1SQSHIQQS 
SHEFIITQHP 
VTLMAEEEKC 
KEEAANRZX3P 
IBQYQPLYKV 



41 

1 

DADRFSSFEE 
NLLPNSTDKY 
DYLQNNFREQ 
YDTMIBKFAV 
KYSDQI»IVDM 
PQISTTTHYN 
TSQTVTELPP 
LDTGAEDSS6 



MSQGPSVTDL 
SSHESRIGLA 
PRVISTPPTP 
GITADSSNHP 
AAQGPLXSTA 
VQVLAYYTVR 
AYAKRHAVGP 
EBQYVFIHDT 
DYSAAIiKQar 
XiLBTZKDFWR 
LSNEBKLIIQ 
MIVHDEHGGV 
ZLSLVSTRQE 



51 
I 

AVKGKGKLRA 
YIYNGSIiTSP 
QYKFSRQVPS 
LYQQUXSEDQ 
PTDNPELDIiF 
RIGTKYNEAK 
HTVBQTSASXi 
SSPATSAIPP 
E6NVNPPSST 
EMPHYSTFAY 



IFPISDDVGA 
DNXKRNRYIV 
EDFHRMIHEB 
NFTLRNTKIK 
VWHCSAGVG 
LVEAIIiSKET 
REKNRTSSIZ 
KIWDHNAQLV 
DFILEATQDD 
TAGTPCALTT 
ENPSTSIiDSN 



Seq ID NO: 187 DNA sequence

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4632 



C3VCACATACG 
CAAAAAAAAC 
CQ0GGAG6G0 
CAGCTCCTCT 
CTT6TTGAAG 
AAATATCCAA 
CAAGTAAAT6 
AACACATTCA 
GTCAOGGGAG 
AAATGCAATA 
GAGATGCAAA 
GGAAAAGGGA 
GATTTCAAA6 
TTAGATCCAT 
AATGOCTCAT 
ACAGTTAGCA 
TCTG6TTATG 
TTCTCTAGAC 
AGTTCA8AAC 
TGGGAAA6AC 
CAGTTGGATG 
GGTGCTATTC 
T6CACTAATG 
AATCCTGAAC 
GAAGAGGGAA 
AACCAAATCA 
AOSAAATACA 
AAGGGTGAT6 
ACAGAAAAA6 
.0AA6GTACTT 
AACTTGTCG6 
AGTTTATTGA 
GCAACTTCTG 
GAAAACCCAG 
GAAGATTCAA 
GTGT6GTTTC 
AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TCCAGACAAC 
OTATACAATG 
GAATCCGAQA 
CTAGTGGTTC 
TACTTAGAGG 
ATTTCAGATG 
CATGCAAGTA 
CAGA6CT6TA 



11 

! 

CACGCAOGAT 
ATTTCCTTOG 
CC6CAGACCX3 
GTGTTTGCCO 
AQATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 
6A6TTTCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTSAAGAGC 
C96ATTATTGA 
TCATACTCTT 
TGACATCTCC 
TCTCTGAAAQ 
TCAT6CTGAT 
A6QIGTTTTC 
C3U3AAAAT6T 
CTC6A6TCGT 
GAGAGGACCA 
TCAATAATTT 
GCTTATAT6G 
TTGATCTTTT 
AAGACATTGA 
6GAAAAAG6A 
ATGAAGCCAA 
TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
G6ACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
AGGCCAGTAA 
AGAAGGCA6T 
TTGTGGGTAT 
ACAGTACATC 
ATGTCGGAGC 
GTGOGTTTAC 
CTGTTGACTT 



21 

1 

CTCACTTCGA 
CTCCCCCTCC 
TCT6QAAATG 
CCTG6ATTGG 
GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TOOQAAAACA 
AAT66TGTTT 
TGGATCASAG 
TGATGCGGAC 
TTTATCCATT 
TGOAGTOGAA 
GAAOCTTCTG 
TCCCTGCACA 
CCAGTTGGCT 
G6ACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATGATACC 
AACCAAGCAT 
6CTACCCAAT 
AAAATACAGC 
OOCTGAATTA 
AGAAGGOGCT 
ACCCCAGATT 
GACTAACOSA 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTT6ATACT 
CATCTCT6AQ 
ATATGAT6TC 
TTCAGAAGAA 
AGACATAACA 
CACTGAOATA 
6ATGTCACAG 
CTTCCCAACT 
CTCCA0G6TC 
TAOTAGCCAT 
TATACCCCTT 
TCTCATCTAC 
CCCTAGAGTT 
AATTOCAATA 
TGAAGAATTT 
A6QTATTACA 



31 

1 

TCTATACACT 
CTCTCCACTC 
CX3AATCCTAA 
GCTAATG6AT 
GGAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTOGAAATTA 
AAAG CAAG CA 
CATAGTTTAO 
OGATTTTCAA 
TTGTTTGAGG 
AGTGTTAQTC 
CCSUU^CTCAA 
GACACAGTTG 
GTTTTTTGTG 
CAAAACAATT 
GGAAAGGAAO 
CCAGAGAATT 
ATGATTGA6A 
GAATTTTTGA 
ATGAQTTATG 
GACCAACT6A 
ATTGGAACTG 
ATTGTGAATC 
TCTACGACAA 
TCCCCAACAA 
TCCACTTCCC 
ACTGT6ACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCC 
CTTATACCAG 
TCACTAAAGG 
GCACAGCC06 
G6TGTTGATG 
GQTCCCTCAG 
GAiGGTAACAC 
AACX3TGGTAT 
GAGTCTOGTA 
GTGATOGTGT 
TGGAGGAAAT 
ATATCCACAC 
AAGCAC7TTC 
GAOACACTaA 
6CAGACAGCT 



41 

I 

GGAGGATTAA 
TGAGAAGCA6 
AAGGTTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GG6ATAAAAC 
ATCrCACtAA 
AGATAACTTT 
AA6GACAAAA 
GTTTTGAGGA 
TTGG6ACAGA 
GTTTTGG6AA 
CTQACAAQTA 
ACTGGATTCT 
AW3TTCTTAC 
TTCGAGAGCA 
A6ATTCATGA 
ATACCAGCCT 
AGTTTGCAQT 
CAOATGGCTA 
TTCTTCA6AT 
TTGTGGACAT 
AA8AAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAGTGA 
AACCAGTCAC 
AACT6CCACC 
TTCTTAQATC 
TAACAGAATA 
ATTCTTCAGG 
AA6GGTATAT 
AATCTGCTAG 
ATCGTTCTAT 
ATGTTQGATC 
AATCTGAGAA 
TTACAGATCT 
CTCATGCTTT 
ACTCX3CAGAC 
TTQGTCTAGC 
CAGCCCTGAC 
GCTTCCAGAC 
CTCCAACACC 
CAAA6CAT6T 
AAGAGTTTTA 
CCAAGGACCC 



51 

I 

AACAAACAAA 
AG6AGC06CA 
CGCTTGCATT 
ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 
ATCATTGGAA 
TGACTAGOST 
TCACTGGGOA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 
GCAGGCT6CT 
TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 
ACAGTACAAG 
AGCA6TTTGT 
TCTT6TTACA 
TTTGTACCAS 
TCAAGACTTG 
AGTAGCCATA 
GCCTACT6AT 
CAAGGAGGAG 
CAGTGCTACA 
TC6CATAGGG 
ATTCTCTGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TGAGGAG6AS 
CTCCA6TCCC 
ATTTTCCTCC 
AAATGCTTCC 
GGAGGGAAAT 
AG6CAGAGAG 
GACAACCAA6 
GGAAATGCCA 
TACCCCATCC 
AACCCAACC6 
T6AGGGGTTG 
TTTTATCTOT 
TGCACACTTT 
TATCTTTCCA 
TGCAGATTTA 
CCAGGAAGT6 
AGACAACAAG 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAQGGTTAA GCTAGCACAG 2880

CTMCTQAAA AOQATCGCAA ACTGACT6AT TATATCAATO CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TOCTGCCCAA OGCCCACTGA AATCCACAGC TGAAGATTTC 3000 

TGQW3AATGA TATGGGAACA TAATGTGGRA GrTTATTGTCA TGATAACAAA CCTG6TGGAG 3060 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 3120 

TTTCTGGTCA CTCAGAAGAG TOTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGOAA QACCCAGTGG ACGTGTGGTC 3240 

ACACAGTATC ACTACACGCA QTGaCCTOAC ATGGSA6TAC CAGAGTACTC CCTGCXAGTC 3300 

CTGACCTTTG TOAGAAAGGC AGCCTATGCC AAGCGCCATG CJWSTGGGGCC TGTTGTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

CAGATTCAAC ACGAAGQAAC TGTCAACATA TTTGGCTTCr TAAAACACAT CC3GTTCACAA 3480 

AGAAATTATT TG6TACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTT6AG 3540 

GCCA.TACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTQTCACCCA 6GCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTQCAACCT TCXTTCTCCCT 3720 

QGCTTAACTG ATCCTCCTAC CTCAGCCTCC CQAQTGGCTG GGACTATACT CCTGAGCCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 3840 

OBAACTTCTT CTATC3VTCCC TGTGQAAAOA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

GAAQGCACAG ACTACATCAA TGCXTTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACXTC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGOAC 4020 

CATAATGCX:C AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATQAATTT 4080 

GTTTACTGGC CaAATAAAOA TGA6CCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATQ 4140 

QCIGAAOAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAQTOTCC TAAATGGCCA 4260 

AATCCAGATA QCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATA6GG ATGGGCCTAT GATTGTTCAT GATGAGCAT6 GAGQAOTGAC QOCAGGAACT 4380 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGG6CRCAA GGCAGGAAGA 6AATCCATCC 4560

ACCTCTCTGG ACAGTAATGG TGCAGCATTO CCTOATGOAA ATATA6CT6A GAQCTT AGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGQ GACTCACATC TGA6CATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTQCXX3C CAAATTTATA TC3VTTAACAA TGTQTGCCTT 4800 

TTTOCAAGAC TT6TAATTTA CTTATTAT6T TTGAACTAAA AIGATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACRGAAA ATTTCAATTT 4920 

ATAGAGGTTA GOAATTCCAA ACTACaOAAA ATGTTTGTTT TTAGTGTCAA ATTTTTA6CT 4980 

GTArTTGTAG CAATTATCAO GTTTGCTA6A AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC rrCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATCGACCAA ATTTATATTT 5160 

ATAATTOTAO ATTTTTATAT TTTACTACT6 AGTGAA6TTT TCTAGTTCTG TOTAATTGTT 5220 

TAGTTTAATC AOGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340 

ATACXrrrCAT TTTGAAAGAA GTTTTTATQA GAATAACACC TTACCAAACA TTGTTCAAAT 5400 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT 6CCATTAAAA AAAAAAAAAA 5460 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 188 Protein sequence:
Protein Accession #: EOS sequence

1 11 21 31 41 51 

| | | | | |

MRILKRPIiAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKyPTOISPK 60 

QSPINIDEDL TQfVNVNLKKL KFQGWDKTSL ENTPIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EBSIiBGQKFP LEMQIYCFDA DRF5SFEEAV KGKGKIjSALS 180 

ILFBVGTEEN LDFKAIIDGV ESVSRFGRQA AXJ}PFII>IiML ItPHSTDinrYI YIlGSbTSPPC 240 

TDTVDWIVPK DTVSISBSQL AVFCEVLTMQ QSGYVMtHDY LQMHPREQQY KPSHQVPSSY 300 

TQKBEIHEAV CSSBPEHVQA DPENYTSLIiV TWERPRWYD TMIEKFAVLY QQLDGBDQTK 360 

HEPLTDGYQD LGAIUINLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLPPE 420 

LIGTEEIIKB EBEGKDIEEG AIVNPQRDSA TKQIRKKEPQ ISTTTHYNRI GTKmSAKTN 480 

RSPTRGSEFS GXiaSVPNTSL NSTSQFVTKL ATEKDISLTS QTVTELPPHT VEGTSASUID 540 

GSKTVLRSPH MHLSQTAESL HTVSITBYEB ESLLTSFKLO TOAEDSSGSS PATSAIPFIS 600 

ENISQGYIPS SENPETITYD VLIPESARNA SBDSTSSGSB ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESPLQTNYTE IRVDESEKTT KSFSAGPVMS QQPSVTDLEM PHYSTFAYFP 720

TKVTPHAPTP SSRQQDLVST VNWYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVIVSAliTPI CLWliVGIU: YWRKCPQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 

ZKBFPKKVAD LHASSGFTEE PBTLKEPYQE VQSCTVDLGI TADSSNHPDN KHKNRYINIV 900 

AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED PWRMIWEHNV 960 

BVIVMITNLV EIOGRRKCDQY WPADGSEEYG NFLVTQKSVQ VLAYYTVROT TLRNTKIKKG 1020 

SQKGRPSGRV VTQYHYTQWP DKGVPEYSIiP VLTFVRKAAY AKRHAVGPW VHCSAOVORT 1080 

GTYIVU)SML QQIC^EGTVH IFGFLKHIRS QRKYLVQTEE QYVFXHDTLV EAILSKETEV 1140 

LDSHI2AYVN ALL1P6PAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA 1200 

SRVAGTILliS QSNIQQSDYS AALKQQIREK NRTSSXIPVE RSRVGISSLS GEGTDYINAS 1260 

YIMGYYQSNE FIITQHPLLH TIKDPWRMIW DHNAQLWMI P06QNMAEDE FVYHSMKDEP 1320 

INCBSPKVTL MAEEHKCLSN BEKLIIQDFI LEATQDDYVL EVSHFQCPKW PNPOSPISKT 1380 

FELXSVIKEB AANRDGFNIV EDBHG6VTAG TFCAIiTILKH QbBKBNSVDV YCfVAKHINIM 1440 
RPGVPADIEQ YOFLYKVILS LVGTRQEBHP STSLDSKOAA LFDGinABSIi ESLV 



Seq ID NO: 189 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304.. 631 

1 11 21 31 41 51 

| | | | | |

CCGGTTCGCA AAGAAGCTGA CTTCAQAQGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60 

COCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120 

OGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 180
TTCAGAGGAA GCGCCTCTGA TTTGTTTCTT
GTTTGGAGAA AGCACAGTTG C3AQTAGCCGQ 
AGGATGC3WSC GOAOACTGGT TCAGCAQTGG 
OTGCCCTCCT GCGGGOGCTC GGTGOAGGGT 
QAACATCAGC TCCTCCATGA CAAGGGGAAG 
CTTCACCATC TGATCGCAGA AATCCACACA 
CCTAACTCCA ASCCCTCTCC CAACACAAAG 
GAGGGCAGAT ACCTAACTCA GGAAACTAAC 
AAGACACCTG GGAAGAAAAA GAAAGGCAAG 
AAACGGCGAA CTCGCTCTGC CTGGTTAGAC 
GACCACCTGT CTGACACX:TC CACAACX3T0Q 
CT6GCCCX3TA GCCTCAGCGG . GGTGCTCTCA 
GCTTGGACAA ACCTAGAATT TTCTCXICTTT 
CAGA6AATAA CTCAQAATAT TGTCTGCCTT 
TGTCCTCCAG CAOCATAQAG AGGCGCTAQA 
CATCAATCCT TTACCACTCT ACCAAATAAT 
ATCTTCATAA TTTGCTGGAG AAGTGTATTT 
TTCTTCAGTG TTTTTCATTT CTTAOGTTCT 
GATATTATCT ACAAACACTG CAGAACAGCA 
ACTTTTTATT TAATTAAATG TATTTAATTA 
TAAATTATGT TTTAAACACA TGCCTTAAAT 
CCAGCrCATA CAAAATAAAT GGTTTCT6AA 
GGTTTTTCTC ATGTATCTTT TTGTTCATTG 
CGGTAGGAAA AATAAAACTT CACATTTAAA 



Seq ID NO: 190 Protein sequence:
Protein Accession #: NP_002811






TTTTCCCTTT 
TTGCTAAATA 
AG06T06CGG 
CTCAGCCGCC 
TCCATCCAAG 
GCTGAAATCA 
AACCACCCCG 
AAGGTGGAGA 
CCCGGGAAAC 
TCTGGAGTGA 
CTGGAGCTCG 
GCTGGGTTTT 
ATGTATCTCT 
AAA6CAGTAC 
6CCCATTCCT 
TTCATATTCA 
CTTCCCXrTTA 
TTCACTTCAA 
TCATGTCATA 
AATCTCAAAT 
TTGTTTAATT 
AATGTTTAAG 
GCAAGATGAA 



TTGCTcrrrc 

AGTCCCGAGC 
TGTTCCTGCT 
6CCTCAAAAG 
ATTTACGGCG 
GAGCTACCTC 
TCCGATTTGG 
CGTACAAAGA 
GCAAGGAGCA 
CTGGGAGTGG 
ATTCACGGTA 
GQAQCCTCCC 
ATOQATTQTG 
CCCC CTACCA 
CTTTCTCCAC 
AGCTTCAGAA 
CTCTCACAOC 
GGGAGAATAT 
AAOGATTCTG 
TTATTTTAAT 
AAATTTAACT 
TATTAACTTA 
ATAATTTTTC 



TGGCTGTGTG 
GCGAGCGGAG 
GAGCTACX3CG 
AGCTGTGTCT 
ACGATTCTTC 
GGAGGTGTCC 
GTCTGATGAT 
GCAGCOGCTC 
GGAAAAOAAA 
GCTAQAAQGG 
ACAGGCTTCT 
TTCTGCCTTG 
TAGCAATTGA 
GACACACCCC 
OGTCACCCAA 
GCTAGTGACC 
TGGGCAAACr 
AGAA6CATTT 
AGCCATTCAC 
OXAAAGAACT 
CTGGTTTCTA 
C3UU36ATATA 
TAGGQTAATO 



1 11 21 31 41 51

| | | | | |

MQRRLVQQWS VAVFLLSYAV PSCGRSVB6L SRRLKRAVSE HQLLHDKGKS IQDLSRRFPL 
HELIAEIHTA EIRATSEVSP NSKPSF29TKK KPVRFGSDDB GRYLTQETHK VBTYKEQPLK 
TPQKKXKGRP GRRKBQEKKK RRTRSANLDS GVTOSOLEQD HLSDTSTTSL BLDSR 

Seq ID NO: 191 DNA sequence 
Nucleic Acid Accession #: XM_059328
Coding sequence: 52..1023



1 

I 

GGGCTGTCCG 
CCTOGCATGC 
GGTATG6T6G 
GC6G0CA0Q6 
GCCAACCTGT 
QGCC OGQAA G 
OTGOATTTGC 
CT6GGCAGGG 
TGCCA^TGr 
6AG06CGGTG 
GTGGAGOSQG 
GAC6CCTTCG 
GCCCTGOCGC 
GCGCACCCOQ 
TTCTCTTGCT 
GCCCAGCTTG 
CCA6GGGAGG 
TaACCXXX:TA 
TTAGTCCTGG 
GGACACTGCC 
AGCCTTCTTG 
TGGTGCCCCT 
CTATATTAAT 



11 
I 

GCCCACTCCC 
GCCTGGTGGT 
ASGCCTTTCT 
A6A606CGGC 
CCX3AGG6COG 
GCTTCTTCCT 
CTCA6GTG0G 
CCCCCA03CA 
TCGCOGAGGC 
TGGGTGGCTG 
AOGCCOGGGC 
TGGGCCTGAG 
GG6TCCTGGA 
6CTACC0CA0 
CTTGGGAGCX3 
CCCAGGATGG 
AGGTCCCCT6 
CMSACAACCA 
CCCAGCGCA6 
ACCTCTGGGC 
GCTGCAGGCA 
CCATGTTGCA 
AAAATAACGT 



21 

I 

CTGGGAGCGC 
CACOGCGGAC 
GQGCQGOGCT 
Q6A0CTGGCC 
CCCO^TGGGT 
T6GCAAGATG 
GGAGGAGCTC 
GGCGGAOGGG 
GCTGCAGGCC 
CACTTGGCTG 
G6CQGTGGGC 
CACTTG06GC 
AG6TACCCTA 
TGTGCCTCCC 
GCTGCATGAG 
CGTGCAGCTT 
TGAGCCCACT 
AOCACTAATC 
A6CTGGGACC 
TCA6GTCCTC 
GGCCTAGCCT 
ATGCAAACAC 
GTGTCTTTC 



31 

1 

GA6CG6T6GA 
GACTTTG6TT 
GTGACCAOGO 
08CA6GCACA 
CCGGCCCX3CC 
GGATTCCGGG 
6AGGCCCAAC 
CACCAGCAC6 
TATGGGGT6C 
6AGGCCCCCX3 
CCCTTCTCCC 
GGGCACATGT 
GGGGGOCACA 
ACOSGGGGCT 
CT6CGCGTCC 
TGCGCCCTCG 
CTGGAACCCT 
CCCTTAGTAC 
TGGAGCACGA 
ATGCCTCCAA 
GTGGCA6CGG 
CTTCACCACT 



41 

I 

CCCAGGCGGC 
ACTGCCC6CG 
TGTCCCTOCT 
GCATCGCCAC 
GTGGCGCCTC 
AGGCGGTGGC 
TAA6CTGCTT 
TGCACGTGCr 
GCTTTACOCG 
CGCGTCCCTT 
GCCA06GCCT 
CCGCTCACOG 
CXICTGACAGC 
GCGGTGAAGG 
TCACCGCGCC 
ACGACCTGGA 
TCCTGGAACC 
CAAGAAAGGG 
TCTGTTGACT 
ATGGCATCTA 
GCTAGGGCCC 
GGGGCAGT6G 



51 
1 

CATGTCCCGC 
AGGOSATGAG 
GGTCAAGQGT 
G6GCCTCCAC 

ATCGCTGCTC 
GGCCGGAGAC 
CCGGGAGCTG 
CCCASGCGTG 
ACTGCGGCTG 
CGCCTGCGCC 
GOGGTGGACA 
CGTGTCCGGG 
CGAGCTGATG 

cccx:gacgct 

CACGCTGCGG 
CTCCAAGAGG 
CTCXTCTACTC 
GAGCCAGGAT 
TCCCTGGGTA 
GAGTTTGAGC 
GCACSAGCATT 
OQAOAGArGG 



Seq ID NO: 192 Protein sequence: 
Protein Accession #: XP_059328 



1 11 

I I 
MSRPRMRLW TADDFOYCPR 
GIiHANLSEGR PVGPARRGAS 
RELLGRAPTH ADGHQHVHVL 
ACAVERDARA AVGPFSRH6L 
EUIAHPGYP^S VPPTGGCGBO 
SKRPGBEVPC EPTIiEPFLEP 



Seq ID NO: 193 DNA sequence 

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126.. 4439 



240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



21 31 41 51 

I I I I 

R0E6IVBAPL AOAVTSVSLL VNGAATESAA ELARRHSIPT 
SLLGPBGFFL 6KMGFR&AVA AGDVDLPQVR EBLEAQLSCF 
PGVOQVPAEA LQAYQVRFTR LPLERGVGGC TWLEAPARAP 
RWTDAPVGLS TOGRHMSAHR VSGALARVLB GTLAGHTLTA 
PDAFSCSWER UIELRVLTAP TLRAQLAQDG VQLCALDDIiD 
SLL 



60 
120 
180 
240 
300 



1 11 21 31 41 51

| | | | | |

CC3GGGCAGGT GGCTCATGCT CGGGAGCGTG GTT6AGCGGC TGGC6CGGTT GTCCTGGAGC 
AQ6GGCGCAG GAATTCTGAT QTGAAACTAA CAGTCTQTGA GCCCTGGAAC CTCCGCTCAG 



60 
120 




AGAAGATOAA GGATATCGAC ATAG6AAAAG A6TA7ATCAT OCCCAGTCCT GGGTATAGAA 180 

GTQTQAGOGA OAGAAGCAGC ACTTCTGGC3A CXSCACAOAOA GC30TQAAGAT TCCAAeTTCA 240 

GQAGAACTCG AC06TTGGAA TGCCAAGA1Q CCTTGQAAAC AGCAOCCOQA GCOOAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT OQATGAGGAG CATCCXAAGG 360 

GAAAGTACXaV TCATGaCTTQ AGTQCTCTOA AGCCCATCCG GACTACTTCC AAACACCA6C 420

ACCCAGTGGA CARTGCTGGG CTTTTTTCCT QTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCaTGTOQC CCACAAGAAG GGGGAOCTCT CAATGGAAOA 0GTGTG6TCT CTGTCCAAGC 540 

AGQAGTCTTC TGACQTQAAC TGCASAAGAC TAGA6A6ACT GIGGCAAGAA 6AGCTGAATO 600 

AAGTTGOGCC AGACGCTGCT TCCXTCCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA OGCAGCTGGC TGGCTTCAQT GQACCAGCCT 720

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACAGA GTCTAACCT6 CAGTACAGCT 780 

TGTTGTTAGT GCTG6QCCTC CTCCTGACX3G AAATCGTGCG QTCTTGGTOG CTTGCACTGA 840 

CTTGGGCATT 6AATTACGGA ACCX3GTGTCC GCTTGOSGGG G6CCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAA6TTA AAQAACATTA AAGAGAAATC CXrTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AOAATGTTTG AGGCAGCAGC OGTTGGCAGC CTGCTGGCTG 1020

GAGGACC06T TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 

TCACAGCATA TTTCftGGAGA AAATGCX5TGG CCX3CCACGGA TGAACGTGTC CAG AAGA TGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

20 AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320 

AGG6TATCAC TGTG6GTGTG GCTCCCATTG TGGTGGTGAT TGCCAGOGTG GTOACXTTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCAGCACA GGCTTTCACA GTGGT6ACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACOGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGOAAGAG GTTCACATGA 1560 

25 TAAAGAACAA ACCAGCCA6T CXTTCACATCA AQATASAGAT 6AAAAATGCC AGCTTG6CAT 1620 

GGGACTCCTC CCACTCCA6T ATCCAGAACT CGCCCAAGCT GACCCCGAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAOAAAGAGA AGGTQAQGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCX3GTGCT OGCAGAGCAG AAAGQCCACC TCCTCCTGGA CA6TGA0GAG CX3GCCCAGTC 1800 

CCGAAGA6GA AGAAGGCAA6 CACATCCACC TGGGCCACCT 6GGCTTACAG AGGACACTGC 1860 

30 ACAGCATCGA TCTGOAOATC CAASAGGGTA AACTG6TTGG AATCTGG6GC AGTGTGGOAA 1920 

GTGGAAAAAC CTCTCTCATT TCAQCCATTT TAGGCCRGAT GACGCTTCTA GAGGGCAGCA 1980 

TTGCAATCAG TGQAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TQAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAG6CCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

35 QAQftOGGAQO AGCCAACCTO AGCGGTG6GC AGCQGCAGAQ GATCAGCCTT GCCOGGGGCT 2220 

TGTATAGTOA CAGGA6CATC TACATCCTGG ACGACXXXXrT CAGrGCXTTTA GATGCCXZATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAOTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC G6AAAGAGGC ACCCATGA66 AACT6ATGAA TTTAAATGGT GACTATGCTA 2460 

40 CCATTTTTAA TAACCTGTTG CTaGGAGAGA CACCGCCAOT TOAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAC3AAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAA6C AOTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 

45 GGTTaAGTTA CTGOATCAAG CAAGGAAGGG GGAACACCftC TQTGACTOSA GGGAACQAOA 2820 

CCTCGGTQAG TOACAOCATG AAGGACAATC CTCATAT6CA GTACTATGCC AGCATCTAGG 2880 

CCCTCTCXar 6GCAGTCAT6 CTGATCCTGA AAGCCATTCG AGGAGrTTGTC TTTOTCAAGG 2940 

GCAOGCTGOG AGCTTCCTCC CGGCTGCAT6 AOGAGCTTTT CC3GAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCXTCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

50 TGGATGAAOT TGAC9GTGCG6 CTGOOGTTCX: AGGOGGAGAT GTTCATCCAG AAC6TTATCC 3120 

TG6TGTTCTT CTGTGTGGGA ATGATCGC3VG GA6TCTTCCC GTGGTTCCTT 6TGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CXACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACaATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

55 AGCIGCTGGA T6ACAACCAA 6CTCCTTTTT TTTTGTTTAC GT6TGC6ATG CXSGTGGCTGG 3420 

CTGTOO GG CT OQACCTCATC AGCATOGCCC TCATCACCAC CAOGGGGCTG ATGATCGTTC 3480 

TTATGCACOG GCAGATTCCC CCAGCCTATG C6GGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAAOGQGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAOCT CQATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAQAATTA 3660 

60 AGAAC3UIGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA A6TATCCTTC ACGATCAAAC" 3780 

CTAAAGAGAA GATTGGCATT GT6GGGC2G6A CAGGATCAGG GAAGTCCTOS CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGA6 GCTGCATCAA GATTGATGGA GTGAGAATCA .3900 

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA 6AGCCGGTGC 3960 

65 T6TTCAGTGG CACTQTCAQA TCAAATTTGQ ACCCCTTCAA CCAGTACACt GAAGACCAGA 4020 

TTTGGGATGC CXTTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAA0G6 CAGCTCTT6T 4140 

GCATAGCTAG AGCCCT6CTC GGCCACTGTA AQATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AA6AGACCAT CCGAGAA6CA TTTGCAGACT 4260 

70 6TACCATGCT GACCATTGCC CATCXSCCTGC ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCXATC GGTCCTTCTG TCCAACGACA 4380 

GTTCX;CX3ATT CTATGCCATG TTTGCTGCTG CAGA6AACAA GGT06CTGTC AA6G6CTGAC 4440 

TOCTCOCIGT TGACOAAGTC TCTTTTCTTT AGAGCATT6C CATTCOCIGC CT6GG6CX3GS 4500 

CCGCTCAT06 06TCCTCCTA CCX3AAACCTT GCCTTTCTC6 ATTTTATCTT TC6CACAGCA 4560 

/5 GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGA6TC ATATTTTGAT TATT6TATTT 4620 

ATTCXATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGGCTATAT TTACAOTGAA AATQTAAOCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT- GCATATTCKT TTCTATCATT TTTGTACAGT 4860 

80 TTGCTQTACT AQAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCAOG GTGCCAGGTT TTCTGGGTGT CCAAAG6AAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCXACAGC CCCCTCTGCC GCCTCCCCAC AGCXX3CTCCA GGGGTGGCTG 5040 

aAG3^£X3GGT6 GGGGGCTOGA GACCATGCAG AGGGCGGTGA GTTCTCAGG6 CTCCTGCCTT 5100 

CTGTCCTGOT GTCACTTACT GTTTCTGTCA GGAQAGCAGC GGG6CGAA6C CXZAGGGCCCT 5160 

85 TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 522 0 

TTTCCTOCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAQC TCCAAGACCT 5340 




GTTGQTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATT TGCCT 5400 

ArpCCCACAC CTCCACAGTT CAQTGGCAG6 GCTCAGGATT TCGTGGOTCT GTTTTCCTTT 5460 

CTCACOGCAG TOSTCGCACA OTCTCTCTCT CTCTCTCCCC TCS^AAGTCTG CAACTTTAAG 5520 

CAOCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAQTT TTrGTACTGT AAAQAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG QTGTQTTCCC GOkAACCCOC TTTGTGCTGT 5640 

GQGGCTGGTA GCTCAGGTGG GCGTGGTCAC TaCTOTCATC AGTTGAATGQ TCAGOGTTGC 5700 

ATOTCOIBAC CAACTAGACA TTCTGTOQCC TTAGCATOXT TGCTOAACAC CTTGTGGAAQ 5760 

CftAAAATCTG AAAATGT6AA TAAAATTATT TTaOATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 194 Protein sequence: 
Protein Accession #: NP_005679.1 



1 11 21 31 41 51 

LdIDIGKOT ilPSPGYRSV RERTSTSGTH RDREDSKPRR TRPLBCQDAL ETAARABGLS 60 

LDASMHSQIiR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDMAGLPSCM TPSWLSSLAR 120 

VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RliWQEELHBV QPDAASLRRV VWIPCRTRLI 180 

LSIVCLMITQ LAGPSGPAFM VKHLLEYTQA TBSHLQVSLL LVLGliLLTBl VRSWSLALTW 240 

AUJVRTGVRL RGAILTMAPK KIUCLKNIKE KSLGBLIKIC SNDGQRMPBA AAVOSLLAfiO 300 

PWAlWailV NVIILGPTGP LGSAVFII.FY PAMMPASRIiT AYPRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAMVKAPSQS VQKIREEERR ILSKAGYPQG ITVGVAPIW VIASWTFSV 420 

HMTLGPDLTA AQAPTWTVF NSMTFALKVT PPSVKSLSBA SVAVDRPKSL PLMBBVHMIK 480 

NKPASPHIKI EMKNATLAWO SSBSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 540 

VLABQKGHLL U)SDERPSPE EBEGKHIHLG HLRLQRTLHS IDLBIQHGKI. VGIOGSVGSG 600 

KTSLISAIIiG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYP BBRVNSVLMS 660 

CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 720 

NHIFNSAIBK HLKSKTVLPV THQLQYLVDC DBVIPMKEGC ITERGTHEEIi MNLNGDYATI 780 

PNNIjLLGBTP PVEINSKKET SGSQKKSQWC GPKTOSVKKB XAVKPBBGQI. VQLBBKGQGS 840 

VPWSVYGVYI QAAGGPLAPL VIMALFMUIV GSTAFSTWWL SYWIKQGSOl TTVTRGNETS 900 

VSDSMKDNPH MQYYASIYAI. SMAVMLILKA IRGWFVKGT LRASSRLHDE LFRRILRSPM 960 

KPPDTTPTGR ILNRPSKDMD EVDVRLPPQA EMPIQNVILV FPCVGMIAGV FPWFLVAVGP 1020 

LVILPSVLHI VSRVLIREIiK RIiDNITQSPF LfifllTSSIQG lATIHAYNKG QEPIiHRYQEL 1080 

LDDNQAPFPL FTCRMRWIAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG lAlSYAVQLT 1140 

GLFQFTVRLA SETEARPTSV ERINHYIKTL SLBAPARIKH KAPSTOWPQE GBWTFEHAEM 1200 

RYRBNLPLVL KKVSFTIKPK EKIGIVGRTG SQKSSIiQMAL FR1.VEI»SGGC IKIDGVRISD 1260 

IGLADLRSKL SIIPQEPVLP SGTVRSNLDP FNQYTH3QIW DALERTHMKE CIAQLPLKI.B 1320 

SEVMEIK2MIP SVGERQLIiCI ARALLRHCKI LItDBATAAM DTETDLLIQB TIREAPADCT 1380 
MLTIAHKLHT VLGSDRXKVIi AQQQWEFDT PSVLLSNDSS RFYAMFAAAB NKVAVRG 

Seq ID NO: 195 DNA sequence 
Nucleic Acid Accession #: NM_006470 
Coding sequence: 228..1922 



1 11 21 31 41 51 

GCTGTCCTGA GCCTGAGTAC TCTAGCTGCC TTGTC6CCAT OGCATCTGGC TGCCATCCAG 60 

CGCCAGCACA CAGTAATGAO TOGCCSSftGCT TCCTCTQGGA GGGAGGAAAC AGTTAAAATC 120 

TTOCAGCAGC TGCAATCATC TAG6CGTGGT TCTCTTCTCT GACTTGGGCT GCACAOATCC 180 

TGGGCCAAGG GACAGAAGAA AGACAGCCTA OGAGCAGAQC CTCCCAGATQ GCTGAGTTGG 240 

ATCTAATGGC TCCA6GGCCA CTGCCCAGGG CCACTGCTCA GCCCCCAGCC CCTCTCA6CC 300 

CAGACrCTGG GTCACXXAGC CCAGATTCTG GGTCAGCCAG CCCAGTGGAA GAftGAQGACXS 360 

TGGGCroCTC GGAGAAGCTT QQCAGGGASA GGGAGGAACA GGACAGC6AC TCTGCA6AGC 420 

AGG66GATCC TGCTGGTGAQ GGGAAAGAGG TCCTGTGTGA CTTCTGCCTT GATGACACCA 480 

GAAGAGTGAA GGCAGTGAAG TCCTGTCTAA CXnXSCATGGT QAATTACTGr GAAGAQCACT 540 

TGCAGCCGCA TCAGGTGAAC ATCAAACTGC AAAGCXIACCT GCTQACGGA6 CSGAOTGAAQG 600 

ACCACAACTG GCGATACTCC CCTGCCCACC ACA6CCCACT GTCTGCTTTC TGCTGCCCTO 660 

ATCAGCAGTG CATCTGCCAO SACTGTTGCC AGOAGCACAG TGGCCACACC ATAGTCTCCC 720 

TG6ATGCAGC CCGCAGGOAC AAGGAGGCTG AACTCCAGTG CACCCAGTTA GACTTGGAGC 780 

GGAAACTCAA GTTGAATGAA AATGCCATCT CCAGGCTCCA GGCTAACCAA AAGTCTGTTC 840 

TGGTGTCGGT GTCAGAGGTC AAAGOGGTGG CTGAAATGCA GTTTGGOOAA CTCCTTGCTG 900 

CTOTGAGGAA GGCCCAGGCC AATGTGATGC TCTTCTTAGA GGAflAAGGAG CAAGCT6CGC 960 

TGA6CCAGGC CAACGGTATC AAGGCOCACC TGGAGTACAQ GAGTGCCGAG ATQGAGAAGA 1020 

GCAA6CAGGA GCTGGAQAQQ AT6GCGGCCA TCAGCAACAC TGTCCAGTTC TTGGAGGAGT 1080 

ACT6CAAGTT TAAQAACACT GAA6ACATCA CCTTCCCTAG T6TTTACGTA GGGCTGAAGG 1140 

ATAAACTCTC GGGCATCCGC AAAGTTATCA CXSGAATCCAC TQTACACTTA ATCCAQTTGC 1200 

TGGAGAACTA TAAGAAAAAG CTCCAGGAGT TTTCCAAGGA AGAGQAGTAT GACATCAGAA 1260 

CTCAAGTCTC TCCOGTTGTT CAGCGCAAAT ATTGGACTTC CAAACCTGAG CCCAGCACCA 1320 

GGQAACAGIT CCTCCAATAT 6CGTAT6ACA TCACGTTTGA CCCGGACACA GCACACAAGT 1380 

ATCrCCGGCT GCRGGAGGAG AACOGCAAGG TCAGCAACAC CAC6CCCTGG GAGCATCOCT 1440 

ACCCGGACCT CCCCAGCAGQ TTCCTGCACT GGC3GGCAGGT GCTGTCCCAG CAGAOTCTOT 1500 

ACCTGCACAG GTACTATTTT GAGGTGGAGA TCTTOGGGGC A66CACCTAT GTTGOCCTOA 1560 

CCTGCAAAGG CATCGACCGG AAAGGGGAGG AGCQCAACAG TTGCATTTCC GGAAACAACT 1620 

TCTCCTGGAQ CCTCCAATGG AACGGGAAGQ AGTTCACGGC CTGGTACAOT GACATGGAGA 1680 

CCCCACTCAA AGCTCGCCCT TTCCGGAGQC TCGGGGTCTA TATCGACTTC CCGG GAGGGA 1740 

TCCTTTCCTT CTATGGCGTA GAGTATGATA CCATGACTCT GGTTCACAAG TTTGCCTGCA 1800 

AATTTTCAGA ACCAGTCTAT GCTGCCTPCT GGCTTTCCAA GAAOGAAAAC GCCATCOGGA 1860 

TTGTAGATCT GGGAQAGQAA CC0GA6AAGC CAGCACCGTC CTTGGGGGTG ACTGCTCCCT 1920 

AGACTCCAGG AGCCATATCC CAGACCTTTG CCAGCTACAQ TQATGGGATT TGCATTITAG 1980 

GQTGATTTGT GGGCAGAAAT AACIGCTGAT GGTAGCTGGC TTTTGAAATC CTATGG GGTC 2040 

TCTGAATGAA AACATTCTOC AGCTGCTCTC rtTPQCTCCA TATGCTQCTG TTCTCIATGT 2100 

GTTTGCAGTA ATTCTTTTTT TTTTTTTTGA QAOGGAGTCT CGCACTOTTQ CCCAGGCTGG 2160 

AGAGCAGTGG CGCGATCTTG GCTCACTCCA AGCTCCGCCT CCCGA6TTCA AGCAATTCTC 2220 

CTGCCTCAGC CTCCOGAGTA GCTGGGATTA CAGGTGCCTG CCACCACACC CAGCTAATGT 2280 

TTTGTATm TAGTAGA6AT GGGGTTTCAC CATGTTGGCC AGGCAGATCT CAAACTCCTG 2340 



ACCTOGT621T OGACCCSVCCT CGGCCTCCCA AAGTGCTGGG ATTACAT6C6 TOAGCCACTG 
CGCCCTGCCT GTTTGTAGTA ATTTTTAGGC ACCRAATCTC CCTCRTCTTC TAGTGCCATT 
CTCCTCTCTG TTCAGGTAAA TGTCACACT6 TGCCCAGAAT GOATGACCAO GAACCTTAAA 
GAGTG6CTGA AAA6ATTGCA GASTTATCAT AATAAATT6C TAACTTQG3T 



Seq ID NO: 196 Protein sequence: 
Protein Accession #: NP_006461 



2400 
2460 
2520 



1 
I 

KAELDLHAPG 
DSAEQGDPAO 
EPVKDBHWRY 
LDItBRKLKLN 
EQAAIiSQANG 
VGLKDKLSGI 
EPSTREQEldQ 
QQSLYLHRYY 
SDMBTPLKAG 
NAIRIVDLGE 



11 
I 

PItPRATAQPP 
EGKEVIfCDFC 
CPAHH8PLSA 
ENAISRLQAN 
IKftHLEYRSA 
RKVITESTVH 
YAYDltFDPD 
PBVEIPGAGT 
PPRRLGVYID 
BPEKPAPSIX3 



21 

I 

APItSPDSGSP 
LDDTRRVKAV 
PCCPDQQCIC 
QKSVLVSVSE 
BHEKSKQELE 
LIQLLENyXK 
TARKyUUiQE 
YVGLTCKGID 
FPGGIZiSFYG 
VTAP 



31 
I 

SPDS6SASPV 
KSCLTCMVNY 

QDCOQERSGK 
VKAVABMQFG 
RMAAISNTVQ 



41 



EMRKvnmp 

RKGEERNSCZ 
VEYDTNTLVH 



CEEHLQPHQV 
TIVSLDAARR 
BtiLAAVRKAQ 
FLBEYCKFKN 
YDIRTQVSAV 
NEHPyPDZiPS 
SCaiNFSHSLQ 
KFACKFSEPV 



Seq ID NO: 197 DNA sequence 
Nucleic Acid Accession #: NM_004316 
Coding sequence: 433-1149 



CCCGAGACCC 
GGGG6TTCAG 
GG6CCAGCG6 
GGAG6AGGGG 
TTGCTCXrCAC 
GCTCC06CTT 
GTGCCCCTC39 
CC6CC6CTGC 
CAGCCGCAGC 
GCCGCGGC6G 
CAGCAGCAGC 
TCAG6G6GCG 
GAACTGATGC 
CAGCA6CCGG 
AACCTGGGCT 
AGTAAGGTG6 
GAG6AGCAT6 
CCCAACTACT 
GACGAGGGCT 
TGGTTCTQAG 
ATC6CACAAC 
AAAAGAAAAA 
CCAACCCCAT 
AGCGCTCAGA 
ACCTGAGTCA 
GAGCftGCACA 
GCTCGGGTCC 
6AGTTBGTGT 



11 
I 

GGGGCAAGAG 
CACTGACTTT 
CAGCCTCACA 
A6G6AGGAGG 
TCTAAGAAGT 
CATATTTCCT 
CGGGCCCCaC 
6CATGGAAA0 
CCCAGCAGCC 
CCGCAGCCGC 
A6CA6CAGCA 
GTCACAAGTC 
QCTGCAAACG 
CCX3CCGTQ6C 
TTGCCACCCT 
AGACACTGCG 
AOGOOGTOAG 
CX3UVGQACTT 
CTTACGACCC 
GGGCTCGGCC 
CTGCATCTTT 
AAAAGAAGAA 
GGCCAACTAA 
ACAGTATCTT 
ATGCGCAAAA 
CX3CQTTATAG 
CTTCACCTCC 
CTTTC 



21 

1 

AGCGCA6CCT 
TOCTGCTGCT 
CX^CGAGCGCC 
AGGOGGCGTG 
CTCCOJGGGA 
TTTCTTTCCC 
ACCTCQGGTC 
CTCTGCCAAG 
CTTCCTGCCG 
CGCAGCGGCA 
GCAGGCGCCG 
AGGGCCCAAG 
0CG6CTCAAC 
GC6CX36CAAC 
TCGQGAGCAC 
CTCGGCGGTC 
GQCOGCCTTC 
GAACTCCATG 
GCTCAGCCCC 
TGGTCAGGCC 
AGTGCTTTCT 
GAABAAGAAA 
GOGAGGCATG 
TGCACTCCAA 
TGCAGCTTGT 
TAACTCCCAT 
CCGCCCTTTC 



31 

1 

TA6TAGGA0A 
TCTGCTTTTT 
ACQCQAGGCT 
CAGGGAGGA6 
TTTTGTArAT 
TCTCTGTTCC 
C06GATCGCT 
ATG6AGAGCG 
CCCGCAGCCT 
GCGCAGA6CG 
CAGCTGAGAC 
CAAGTCAAQC 
TTCAQGOGCr 
GAGGGGQAGC 
GTCCCCAACG 
6AGTACATCC 
CAGGCA6GG0 
GCX38GCTCGC 
GAGGAGCAGG 
CTGGXGa3AA 
T6TCAGTGGC 
A6AGAA6AAG 
CCTGAGAGAC 
TCATTCAOGG 
GTGCAAAAGC 
CACCTCTAAC 
TTAGAGTGCA 



41 
1 

GGAACGCX3A6 
TTTTTCTTAG 
CCCGAA6CCA 
AAAAAGCATT 
ATTTTTTAAC 
TGCACCCAAG 
CTGATTCCGC 
GGGG06CCGG 
GTTTCTTTGC 
G6CAGCA6CA 
OGGCGGCGGA 
GACAGGGCTC 
TTGGCTACAG 
GCAACCGCX3T 
GCGCGGCCAA 
GG6CGCT6CA 
TCCTGTCGCC 
OGGTCTCATC 
AGCTTCTCGA 
TGGACTTTGG 
GTTGGGAGGG 
AAAAAAA06A 
ATGGCTTTCA 
AGATATGAAG 
AGTGGGCTCC 
AC6CACA6CT 
GTTCTTAGCC 



51 
I 

IiGRSTEEQDS 
NIKLQSHLLT 
DKEAELQCTQ 
ANVMLFLEEK 
TEDITFPSVY 
VQRKYHTSKP 
RFLHWRQVIiS 
HNGKEFTAWY 
YAAFWIiSKKB 



51 

1 

ACGOGGCAGA 
AAACAAGAAG 
ACC06CGAAG 
TTCA<XTTTT 
TTOCGTCAGG 
TTCTCTCTGT 
GACTCCTTG6 
CXrAGCAGCCC 
CACGGCCX3CA 
GCAGCAGCAG 
GGGCCAGCCC 
6TCTTCGCCC 
CCTGCCGCAG 
CAAGTTGGTC 
CAAGAAGATG 
GCAGCTGCTG 
CACCATCTCC 
CTACTCQTC3G 
CTTCACCAAC 
AAGCAGQGTG 
GGA6AAAAGG 
AAACAOTCAA 
QAAAAOGGQA 
AGCAACTGGG 
TGGCAGAAGG 
GAAAGTTCTT 
CTCTAGAAAC 



Seq ID NO: 198 Protein sequence: 
Protein Accession #: NP_004307 



51 



1 11 21 31 41 

| | | | | | 

MBSSAKMES6 GAGQQPQPQP QQPFLPPAAC FFATAAAAAA AAAAAAAQSA QQQQQQQQQQ 
QQQQAPQLRP AADGQPSGOG HKSAPKQVKR QRSSSPELMR CKRRIiNFSGF GYSLPQQQPA 
AVARRNERER NRVKLVNLGF ATLREHVPNG AANKKMSKVE TLRSAVEYIR ALQQLLDHHD 
AVSAAFQAGV LSPTISPNYS NDLNSMAGSP VSSYSSDE6S YDFLSPEEQE LLDFTHWF 

Seq ID NO: 199 DNA sequence 
Nucleic Acid Accession #: NM_007015 
Coding sequence: 1-1005 



1 
1 

ATGACA6AGA 
TGCAGCCCCC 
AAGGTGGGAG 
GCCTTCTACT 
ATCAATGG6A 
TTTAAAAT6G 
ACAGGAATTC 
ATTCCTGAGG 
AT6CCAQTCA 
GACAACAGCT 
CTTAAACCAA 
GTTCCAACTA 
CTGAATAATG 



11 

1 

ACTCCGACAA 
CGGCX5TACGC 
COGTGGTCCT 
TCTGGAAGGG 
AACTACAAGA 
GAAGTGGAGC 
GTTTTGCTGG 
TGGGOSCOGT 
AATATGAAGA 
TCTTGAGTTC 
CCTATCCAAA 
CCACAAAAAG 
AARCCAGACC 



21 

I 

AGTTCCCATT 
TACGCTGACG 
CATTTCGGGA 
GAGCQACAGT 
TGGGTCAATG 
TGAA6AAGCA 
AGGAGAGAAG 
6ACCAAACAG 
AAATTCTCTT 
TAAGGTGTTA 
AGAAATCCAG 
ACCACACAGT 
CAGTGTTCAA 



31 
I 

GCCCTGGTGG 
GTGAAGCCCT 
GCTGTGCTGC 
CACATTTACA 
GAAA7AGAC6 
ATTGCAGTTA 
TGCTACATTA 
AGCATCTCCT 
ATCTGGGTGG 
6AACTCTG0G 
AGGGAAAGAA 
GGACCACGGA 
GAGGACTCAC 



41 
I 

GACCTGATGA 
CCAGCCCCGC 
TGCTCTTTGG 
ATGTCCATTA 
CTGGGAACAA 
ATGATTTCCA 
AAG06CAAGT 
CCAAACTGGA 
CT6TAGATCA 
GTGAOCTTCC 
QA6AAGTGGT 
GCAACCCAGG 
AAGCCTTCAA 



51 
I 

GGTGGAATTC 
GCGGCTGCTC 
GGCCATCGGG 
CACCAT6AGT 
CTT6GAGACC 
GAATGGCATC 
GAAGGCrCGT 
AGGCAAGATC 
6CCTGTGAA0 
TATTTTCTGQ 
AAGAAAAATT 
CGCTGGAAGA 
TCCTGATAAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 




CCTTATCRTC AGCaCGAAGG GGAAAGCATG 
ATCTGTTSTA TAGAATGTAO 6CSGAGCTAC 
6GGOC3CTATT ACCCATGGCC TTATAATTAT 
ATGCCATGTA GCTGGTGGGT GGCCCGTATC 
CACGTGCTGT AAAATAAGAA CTA GCTG AAG 
ATGCTGATGG GACCATAAAA TATTT TTACA 
TAACAQAATT TTTTTAATCQ TTTTCCAGAA 
AGTTCMGTC TAAAATOCCA TA ACCC CGTT 
CATAA6TCTT CCCTTGCTTG CATCTTCCAA 
AGTTTGCC 



ACATTCGACC 
ACCCACTGCC 
CAAGGCTGCC 
TTGGOCATGO 
AGACAACCAA 
CX3CAGCCTGA 
CTTTA6TATA 
ATTTGTTATT 
AGCTATTTCO 



CTAGACTGOA 
AGAAGATCTG 
6TTCGGCCTG 
T6TGAAATCA 
AGAAGCATTA 
GCX3GTTATTC 
TGCAAATGCA 
TTTTATTTGC 
AAATAAACAC 



TCAOGAASGA 
TGAACCCCTG 
CAGAGTCATC 
CTTCATATAT 
A6GCAC30TTG 
TTGACACTCT 
CTGAAAGGGT 
ATTGATTTGC 
6AAAATTTAC 



840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



Seq ID NO: 200 Protein sequence: 
Protein Accession #: NP_008946 



MTEMTSDRVPI 
AFYPWKGSDS 
TGIRFAGGBK 
DNSFLSSKVL 
LNNETRPSVQ 
GGYYPHPniY 



11 

1 

ALVGPDDVBF 
HIYNVHYTMS 
CyiKAQVKAR 
ELCXaSLPIFW 
EDSQAFNFDN 
Q6CRSACRVI 



21 



31 41 51 

i i I 

CSPPAYATLT VKPSSPARLIj KVOAWLISG AVLLLFGAIG 
INGKLQDGSM EIDAGNNLET FKM6SGAEEA lAVNDFQHGI 
IPEVGAVTKQ SISSKLECKI MPVKYEEHSL IWVAVDQPVK 
LKPTYPKEIQ RERREWRKI VPTTTKHPHS GPRSNP Gfl^ 
PYHQQEGESM TFDPRliDHEG ICCIECRRSY THOQKICEPL 
MPCSWWVARI LGHV 



Seq ID NO: 201 DNA sequence 

Nucleic Acid Accession #: NM_000728.2 
Coding sequence: 112..495 



I 

GTAATAAGAG 
GTG6ACCG6C 
CX3GAAGTTCT 
CAGGGGGGGC 
GAGGAOGCGC 
GAGCTGAAGC 
AACACTGCCA 
GTGAAGAGCA 
QACCTTCAAG 
CATATCCTIA 
AAG6AGGCAC 
TQ6AAGAAGA 
GAGAATAATT 
GGAAACTAAT 
GQTTATTTGG 
ACTGTACCAC 
GTATGTAGCA 
AGCATCTATT 
AAATTTTTGT 
TGGATGCAA6 
TTGCTTTTTC 
TCCAATTCAT 
TTGCCTAACT 
TATAGTTTTA 
TSAGAGGTGT 
TTTGTTAAAA 
CTGATCATAT 
ACCATTCTTT 
ACCAGATAAT 
CTAAAATATT 
TTCTGATGAG 
CATATTAATA 
GTTTTCTCTG 
ATATCTTGTT 
ATTTTTOTTT 
AAAAAAAAAA 



11 
I 

CGGGGTCTCC 
OOCTCGCGCT 
CCCCCTTCCT 
CATTCAGGTC 
GCCTCCTQCT 
AGGAGCA6GA 
CCTGTGTGAC 
ACTTCGTQCC 
CCTGAGCAGA 
TAA0AGATTC 
AAGCCAA6GA 
GCAGCCCTGC 
TCTGTTGTTT 
ACAATACATT 
AAAGTGTGTA 
TTOQCCTTCT 
GTATCTCATT 
TTACCATATG 
TGGCTTGCTT 
ATTGTTTTCA 
ATTTTCTTAG 
CTTTTTTTTT 
AAGGTCCCAA 
TATTTTAIAT 
A8QTTGAAAT 
AGACTQTTAT 
TTGTGTGGGT 

tgccaatgtc 
6tgggtctac 
ttctacatct 

ATTTTTAATG 
ATATTAAGTC 
TTTTTTTTTT 
AGATTTTTAA 
TTAATTGTTC 
AAAAAAAAAA 



21 
I 

GCX3GGGAAG0 
GCCCTGAAAC 
GGCTCTCAGT 
TGCCCTGGAG 
OGCTGCACTG 
6ACACAC3GGC 
TCATCGGCT6 
CACCAATGTG 
T6AATGACTC 
ACT CASAflfl A 
AGTCTGT6TC 
TGACACCTAG 
TAAOCCACAA 
TTCATTTATT 
TTTAACTCTO 
TGCCAGCCAC 
GCTGTTTTAA 
TTTATCACCT 
GCTTTATTAG 
GATATATAGT 
CAQTGTCTCT 
CTTTTATGTA 
GGTCACAATA 
GTACSITTAGT 
TCATACCTGT 
TTCACCATTT 
ATATrrCTGG 
ATACTGCCTT 
CAACATTGTT 
TTTATACATT 
GGATTGTGTT 
GTTCAATTCA 
TTTAACAGTG 
CTATTTTATT 
ATTGCTAGTA 
AAAAAAAAAA 



31 
I 

G8CCCACAGC 
TCTAGTCGCC 
ATCTTGGTCC 
AGCA6CCCA6 
GTGGAaaACT 
TCCAGCTCCO 
GCAGGCTTGC 
GGTTCCAAA8 
CAG6AAGAA6 
CACATGTGGA 
T ACCAG AAGC 
W3TTTGGACT 
AGTTTGTGGT 
TTGGGTAAAT 
TAAOAAACTG 
ATATGAOAGC 
TTTGTATTTC 
TTATTGAAGG 
TGTTGAGTTT 
TTGGAAACTT 
CACA6AGAAA 
TTGTGCTTTT 
ACCTTATTCT 
GATCTATTTT 
GAATATAGAT 
AATTGCCCCT 
GTTCTCAATT 
GATTAGTGTA 
CATTCTTGTT 
TTAJQAATCAG 
AAATCA6TGG 
T6AACACAAT 
TTCTCAGTTT 
TTTTGGTQCT 
GATA6AAATA 



41 
I 

AGGTGTGGT6 
AGAQAGGCG6 
TGTACCAGGC 
ACCC66CCAC 
ATCrrOCAGAT 
CTGCCGAGAA 
TGAGCAGATC 
CCTTTGGCAG 
GTGTGTCCTA 
QAA6GTQACA 
CAGAATCACA 
TCCAGCTTCC 
AATTTGTTAT 
GCCTTGGAGT 
CCAAACTATT 
TCTAGTATTT 
CCC3ATGACT 
GTCTGTTTAA 
TTAGAGCTCT 
CCTTCCCCTG 
AAGTTGTAAT 
AGTTCATGTC 
ATACTTTCTT 
GAGTTAATTT 
ACCCAATTGT 
GCACCTTTGT 
CTGTCTCATT 
GTGTTAAAGT 
CAAAAAGATT 
TGTGTTACTA 
GTTAATTTTG 
ACATGTTTTC 
TCAACAOAAA 
AATGTAAATG 
CAATATTTAA 



51 
1 

TTCATCCC6G 
CATGGGTTTC 
GGGCAGCCTC 
ACTCA6TAAA 
QAAOGCCAGT 
GA6A0CCTGC 
AGGGGGCATG 
GCX3C0GCAGG 
AATCCAATGA 
TQACAGAGGC 
GAACAGTCTC 
AGAACTGTGA 
GACAGCX:CTA 
GGGATTGCTG 
TTCTQAAGTG 
CCACAAATAG 
AATGACGTTG 
ATCTTCTGCT 
TTATATGTTG 
AATCTGOSGA 
TTGAATAAGA 
TAAGAACTCT 
GTAAAAGTTT 
TTGTATAAGG 
TTCAGTGCCA 
CAAAAAGCAA 
GATTGATTTQ 
6AATCIGAAA 
TTAGCTACAT 
TCTACAAAAT 
GGAGAATTAG 
ACTTATTTAG 
TATTCTACAC 
GTACTTAAAC 
AATATTAGGA 



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 



Seq ID NO: 202 Protein sequence: 
Protein Accession #: NP_000719.1 



11 



21 31 41 51 

MGPRKPSPFL ALSILVIiYQA GSLQAAPFRS ALESSPDPAT LSKBDARIOiL AALVQDYVQM 
KASELKQEQE TQGSSSAAQK RACNTATCVT HRLAOLIiSRS GGMVKSMFVP TNVQSKAPOT 
RRStDLQA 



60 
120 



Seq ID NO: 203 DNA sequence 
Nucleic Acid Accession #: NM_001741 
Coding sequence: 71..496 

1 11 21 31 41 51 

| | | | | | 

CTCTCGCTGG ACQCCGCCQC CGC06CTGCC ACCGCCTCTG ATCCAAGCCA CCTCCCGCCA 
GAGAGGTGTC ATGGGCTTCC AAAAGTTCTC OCCCTTCCTO GCTCTCMCA TCTTOGTCCT 
GTTOCAGGCA GGCAGCCTCC ATGCAGCACC ATTCAGGTCT GCCCTGGA6A GCAGCCCAGC 
AGACCCGGCC ACGCTCAGTG AGGACGAAGC GCGCCTCCTG CTGGCTGCAC TGGTGCAGGA 
CTATGTGCAG ATGAAGGCCA GTGAGCTQGA GCAGGAGCAA GAGAGAGAGG GCTCCAGCCT 



60 
120 
180 
240 




GGACAGCCCC AGATCTAAGC GGTGCX3GTAA TCTGAGTACT 
6CAGGACTTC AACAftGTTTC ACAOGTTCCC CCAAACTGOV 
AAAQAAAAGG GATATGTCCA GCGACTTGCA OAOAGACCAT 
CCAGAATGCC AACTAAACTC CTCCCTTTCC TTCCTAATTT 
TAACTTGATG CATGTGGTTT GGTTCCTCTC TGC5TGGCTCT 
TTCCTTGTGG CAGAGGATGT CTCAAACTTC AGATGGGAG6 
GTTQQAAGAO AATCACCTGG GAAAATAOCA GAAAATGAG6 
QATOTCATCA GAOCTCCTCT QTCCTGCTTC TGAATCTGCT 
TAITTTTCCC C 



Seq ID NO: 204 Protein sequence: 
Protein Accession #: NP_001732 






TGCATGCTGG 
ATTGGQOTTQ 
OOCCCTCATG 
CXXTTCTTQC 
TTGGGCTGGT 
AAAGAGAGCA 
0C06CTTTGA 
(3ATCATTTGA 



gcacatacac 
gagcacctgg 
ttaqcatgcc 

ATCCTTCCTA 
ATTGGTGGCT 
GGACTCACAG 
GTCCCCCAGA 
GGAATAAAAT 



51 



1 11 21 31 41 

| | | | | | 

MGFQKFSPFL ALSILVLLQA GSLHAAPFRS AIiESSPADPA TLSEDBARUi LAALVQDYVQ 
MKASELEQEQ EREGSSLDSP RSKRCX3NLST CHLGTYTQDF NKEBTFPQTA IGVGAP6KXR 
DNSSDLERDH RPHVSMPQNA N 

Seq ID NO: 205 DNA sequence 
Nucleic Acid Accession #: NM_005361 
Coding sequence: 1-945 



ATGCCTCTTG 
GAGGCCCTGG 
TCCTCTTCTA 
CCTCCCCACA 
AGACAATCOG 
CTGGAQTCC6 
CTCCTCAAGT 
AGAAATT6CC 
GTCTTTGGCA 
TGCCTGGGCC 
CTCCTGATAA 
ATCTGGGAGG 
CATCCXAGGA 
GTGOCCGGCA 
ACCAGCTATG 
TACCCACECC 



11 
I 

AGCAGAGGAG 
GCXrrGGTGGG 
CTCTAGTG6A 
GTCCTCAQG6 
ATGAGGGCTC 
AGTTCCAA6C 
ATCGAGCCAG 
AGGACTTCTT 
TCX3AGGT6GT 
TCYCCTAC6A 
TOGTCCTGGC 
AGCTGAGTAT 
AGCTGCTCAT 
GTGATCCTGC 
T6AAAGTCCT 
TGCATGAAOG 



21 
I 

TCAGCACTGC 
TGCGCAGGCT 
AGTTACCCTG 
AGCCTCCA6C 
CA6CAACCAA 
AGCAATCAGT 
GGAGCCGGTC 
TCCG6TGATC 
06AA6TGGTC 
T6GCCTGCTG 
CATAATCGCA 
GTTGGAGGTG 
GCAAGATCTG 
ATGCTAGGAQ 
GCACCATACA 
GQCTTTGAOA 



31 
1 

AAGCCTGAAG 
CCTGCTACTG 
GGGGAGGTGC 
TTCTOGACTA 
GAAGAGGAG6 
AGGAAGATGG 
ACAAAG6CAG 
TTCA6CAAA6 
CCCATCApCC 
GGG6ACAATC 
ATAGAGGGCG 
TTTGAGGGGA 
GTGCAGGAAA 
TTCCTGTGGG 
CTAAAGATCQ 
GAGGQAGAAQ 



41 
I 

AAGGCCTTGA 
AGGAGCAGCA 
CTGCrOCCGA 
CXATCAACTA 
GGCCAAGAAT 
TTGAGTTGGT 
AAATGCTGGA 
CCTCCXSAGTA 
ACTTGTACAT 
AGGTCATGCC 
ACT6TGCCCC 
GGGAGGACAG 
ACTACCTGGA 
GTCCAAGQGC 
GTOGAQAACC 
A0T8A 



51 

1 

GGCCCGAGGA 
GACOGCTTCT 
CTCACCGAGT 
CACTCTTT66 
GTTTCCOSAC 
TCATTTTCTG 
GAGTGTCCTC 
CTTGCAGCTG 
CCXTGTCAGC 
CAAGACA66C 
T6AGGAGAAA 
TGTCTTCGCA 
GTACG6GCAG 
OCTCATTGAA 
TGAO^TTTCC 



Seq ID NO: 206 Protein sequence: 
Protein Accession #: NP_005352 

1 11 



21 31 41 51 

I 11)1 
HPLEQRSQHC KPBEGXiBARG EALGLVGAQA FATEEQQTA8 SSSTLVBVTL GEVPAADSPS 
PPHSPQ6A8S FSTTZNYTIM RQSDEGSSNQ EBB3PRMFPD LB8ES0AAIS RKHVELVHFL 
LUKYRAREFV TKAEHLSSVL BNGQDFFFVI FSKASBYLQL VFGIBWEW PISBLYXLVT 
OiGLSyDGItX* GDNQVMPKTG IiLXXVLAIIA IBSDCftFESK IWBBLSMLEV FEGRED8VFA 
HPRKLLNQDL VQENYLEYHQ VPOSDPACyE FLMGPRALIE TSYVKVLBBT LKIG6EPRXS 
YPPIiHERALR EGEE 

Seq ID NO: 207 DNA sequence 
Nucleic Acid Accession #: NM_021115 
Coding sequence: 743-2893 



AAAGGAAGGG 
GGCACCGCCC 
CCCAAACTAA 
CCCTTTGGGT 
GCACCCTGAA 
GGGOGAGCTG 
ACC6CTQCTT 
TTCGCTCAAG 
CACTGTCCAA 
CACGGASAA6 
AGAA6TGCCC 
GCAAATCTCC 
ACCOBGGGAG 
GGCCCTGATG 
GACCACTACC 
CTGCAGTGTG 
GCCGCTCAAC 
66AGCTCCAG 
GGAOGGCCCT 
CC3GAAGCCCC 
GACCTTCCAG 
CTCIGGG6AT 
CCTGGGCTAT 
CTGGAGCAGC 
CATCGGCCGC 
CTGGACGATT 



11 

1 

AGGGAGGGAG 
TTAGGAGGGC 
CTGGTGTCTT 
CCTTACCTCC 
GAGAGAGTGG 
GTGCTGGATG 
CCAGASQAGG 
CAGGT8AACT 
AGGGCAGGGT 
CCTGGCCCAC 
CTTTGGCTGG 
CCCTTCACTT 
CCTGGGCCTG 
GACAAAGGTG 
TCCACCATTA 
AGCTTCTCCA 
AACTTTGTGG 
GTGAAGAQTG 
ACCCTGACOG 
AGCAACACCA 
CTTCACTACC 
GTCA06GTGA 
GA6CTCCAGG 
CAGGAGCCCA 
GTCCTCTCCC 
GAAGCTCCAG 



21 

1 

AAAGGAGAAG 
CACCCTCAGA 
TTCTCXn*CTT 
TGCCCTCAGG 
TAACAGCX3CC 
GGACCXSGACC 
CCOQCGCCftA 
CTGCCAGGAA 
CCCAGCCAQC 
CG6GGGACCC 
ACOGAAAGGA 
CGCAGCCCZA 
ACATGGCCCA 
AGAATGAGCT 
TCACCACCAC 
ATCCTGAGGG 
A6TX3CACATA 
TGAACCTGTC 
TCCTGGCCAA 
TCTCCGTCTA 
AGGCCTTCAT 
T6GACCTGCA 
GCGCTAAGAT 
TCTGCTCAGC 
CAAGTTACCC 
A6GGCCAGAA 



31 
I 

TTGGTTTAGA 
GTCTQACAGC 
CCAAGATGCT 
AGCCCCQGAG 
CCCCAGTTCC 
CTCTGCACAT 
GGAOGCCTTQ 
GCA6CTGAGG 
GTCCCAGGGC 
GGACCCCATC 
GAGTGCGGTC 
TGT66CCCAC 
GGAG6CCCCC 
GACTGGGTCA 
GGTCATCACC 
GTACATTGAC 
CAACGTGACA 
CGATGGGGAA 
CCAGACACTC 
CTTCCGGACC 
6CTGAGCTGC 
CTCAGGTGGG 
GCTGACATGC 
TCCTTGTGGA 
TGAAAACACA 
GCTGCACCTG 



41 

I 

GGCX!AGCXX3G 
A6GTGAAGGT 
CTTCCCGAGG 
AGAGGCAGTC 
TCACAGTCGG 
CAGSACATCC 
CCCCCCAAGA 
CCCAAG6CCA 
CTAGATCTCC 

CCTACAACAC 
ACACTCCCCC 
CAG6AGGACA 
GCCTCAGAGG 
ACOGAGCAGG 
TCCAQCGACT 
GTCTACACT6 
CTGCTCTCCA 
CTGGTG6AG6 
TTCCAGGACG 
AACTTTCCCC 
GTGGCCCACT 
ATCAATGCCT 
GGGGCAGTGC 
AATGGGAGCC 
CACTTTGAGA 



51 
I 

ACGAGCTTTG 
CCTAAATCTC 
GAGATGCTAG 
CTGGCAAAGA 
CGGAAGTGCT 
GAGCCCT6TC 
AOAAACTGCC 
CCT0G6CAGC 
TCTCCTCCTC 
A6GAQGCATC 
CCGCACCCCT 
AGAGGCCAGA 
CCA6CCCCAT 
AGAGCCAGGA 
CACCAGCTCT 
ACCCACTGCT 
GCTATGGGGT 
TCCGCGGGGT 
GGCAGGTAAT 
ACGGCCTTGG 
GCC6GCCTGA 
TTCACTQCCA 
CCftAGCOSCft 
ACAATGCCAC 
AATTCTGCAT 
6GCT6TTGCT 



360 
420 
480 
540 
600 
660 
720 
780 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 




GG^TGl^CAAG 6A.CAi8Gft.TGA OGtsTTCAGAG GGGGCA6ACC AACAACsTCAG CTCTTCTCTA 1620 

OGACTCCCTT CAAACOOAGA GTGTCCCTTT TGAGGGCCTG CTGAGCGAAG GCAACACCAT 1680 

OCGCATCGAG TTCACGTCCX3 ACCAGGCCOG GGCGGCCTCC ACCTTCAACA TCCQATTTGA 1740 

AGCGTTTGAG AAAGGCCACT GCTATGAGCC CTACATCCAG AATGGGAACT TCACTACATC 1800 

GSACCCSACC TATAACATTG GGACTATAGT G6AGTTC!ACC TGGGACOCOO GOCACTOOCT 1860 

GGAGCAGG6C CCGGCOVTCA TC6AATGCAT CAATGTGOGG 6ACCCATACT G6AATGACAC 1920 

AOAQCCCCTG TGCAGA6CCA TGT6T6GTGG GGAGCTCTCT GCTGTGGCTG GGGTGGTATT 1980 

GTCCCCAAAC TGGCCCGAGC CCTACGTGGA AGGTGAAGAT TGTATCTGGA AGATCCACGT 2040 

GGGA6AA6AG AAACGGATCT TCTTA6ATAT CCAGTTCCTG AATCTGAGCA ACAGTGACAT 2100 

CTTGACCAIC TAOGATGOGG ACGAGGTCAT GCCOCACATC TTGGGGCAGT ACCTTGGGAA 2160 

CA6T0GCCCC CAGAAACTGT ACTCCTCCAC GCCAGACTTA ACCATCCAGT TCCATTOGGA 2220 

COCTGCTGGC CTCATCTTTQ GAAAGGGCCA GGGATTTATC ATGAACTACA TA6AGGTATC 2280 

AAGGAATGAC TCCTQCTCGG ATTTACCCGA GATCCAGAAT GGCTGGAAAA CCACTTCTCA 2340 

CACGGASTTG GTGCX3GGGAG CCAGAATCAC CTACKAGTGT GACCCCGGCT ATGACATCGT 2400 

GGGGAGTGAC ACCCTCACCT GCGAOTGGGA C3CTCAGCTGG A6CAGCGACC CCCCATTTTG 2460 

TGAGAAAATT ATGTACTOCA COBACCCOQ6 AGAGGTGGAT CACTCGACCC GCTTA ATTTC 2520 

GGATOCTGTG CTGCTGGTGG GGACCACCAT CCAATACACC TGCAACCCOG GTTTTGTGCT 2580 

TGAAGGGAGT TCTCTTCTGA CCTGCTACAG CCGTGAAACA GGGACTCCCA TCTG6ACGTC 2640 

TCGCCTGCCC CACTGCGTTT CAGAAGOSGC AGCAGAGAOG TOGCTGGAAQ GGGGGAACAT 2700 

G6CCCT6GCT ATCTTCATCC CGGTCCTCAT CATCTCCTTA CTGCTGGQAG GAGCCTACAT 2760 

TTACATCACA AGAT0T06CT ACTATTCCAA CCTCCGCCTG CCTCT6ATGT ACTCCCACCC 2820 

CTACAGCCAG ATCACCGTGG AAACCOAGTT TGACAACCCC ATTTACX3AGA CAGGGGGAAC 2880 

CCAAAAGGTT TAGGGTTTCA TTTAAAAAGA GGTACCCTTT AAAAAGGGGC TTGTGAACTC 2940 

AACCCCAATT TCCCCGAGAC ATTTATCCAA AGGCCCTGGG GGCCTTGATT TAAACCCCCA 3000 

AAAGGCXSGCT GTTTTTTGGT TAAACTTTTT AACAAAGG6T TAOGGGTTTT TTCCCOGGAT 3060 
TTTATAAATT TTAAAAGTG 

Seq ID NO: 208 Protein sequence: 
Protein Accession #: NP_066938 

1 11 21 31 41 51 

| | | | | | 

MAOEAPQBDT SPMALMDKGE NBLTGSASEE SQSTTTSTIl TTTVITTBQA PALCSVSFSK 60 

PBGYIPSSDY PIjLPUINFLE CTYNVTVYTG YGVELQVKSV HLSDGELLSI RGVDGPTLTV 120 

LANQTLLVEG QVIRSPTNTI SVYFRTPQDD GLGTFQIiHYQ APMIiSCNFPR RPDSGDVTVM 180 

DLHS6GVAHF HCHLGYGLQG AKMXiTCXKAS KPHHSSQEPI CSAP066AVH NATIGRVIiSP 240 

SYPEHTK6SQ FCIMTIEAPE GQKEiKLHFER LLXiaSKDRKT VBSGQfTflKSA XiIiYDSlOTES 300 

VPFEGLLSE6 NTZRIEFTSD QARAASTFMI RFBAFElCraC YEPYIONGHF TTSDPTyNZG 360 

TIVEFTCDPG HSLEQQPAII KCINVRDPYW NDTEPLCRAM CGGELSAVAG WLSPNWPEP 420 

yVEOEDCIWK IHVQEEKRIF LDIQFLNLSN SDILTIYDGD EVMPHILGQY LGNSGPQKLY 480 

SSTPDLTIQF HSDFAGLIFG XGQ6FIMNYI EVSRMDSCSO LPBIQNGHKT TSHTELVRGA 540 

RITYQa)F6Y DIVGSDTLTC QNDLSttSSDP PFCEKIHYCT DFGEVDHSTR LISDFVLLVG 600 

TTIQYTCNPG FVLEGSSLLT CYSRETGTPI WTSRLPHCVS BAAABTSLBQ GHMALAIFZP 660 
VLIlSliLLGG AYIYITRCRY YSNLRLPLHY SHPYSQITVE TEFI3NPIYET GQTQKV 

Seq ID NO: 209 DNA sequence 
Nucleic Acid Accession #: NM_001327.1 
Coding sequence: 89-631 

1 11 21 31 41 51 

| | | | | | 

AGCAGGGGGC OCTOTOTGTA COGAGAATAC GAQUkTACCT OSTOOGOCCT QACCTTCTCT 60 

CTGASAGCGG GGCAGAG6CT CCGOAGCCAT GCAG6C08AA G60CGGGGCA CAGGGGGTTC 120 

GAOGGGOGAT GCTGATGGCC CAGGAGGCCC TGQCATTCCT GATGGCCCAG GGGGCAATGC 180 

TGGCGGCCCA GGAGAGGOGG GTGCCACGGG CGGCAGAGGT CCCCGGGGCG CAGGGGCAGC 240 

AAGGGCCTCG GG6CCX3GGAG GAGGCGCCCC GOGGGGTCCG CATGGCGGCG CGGCTTCAGG .300 

GCTGAATGGA T6CIGCAGAT GOGGGGCCAG GGGGCG8GAG AGCCGCCTGC TTGA6TTCTA 360 

CCTCGCCATG CCTTTCGCGA CACCCATGGA AGCAQAGCTG GCCCGCAGGA GCCTGGCCCA 420 

GGATGCCCCA COGCTTCCGG T6CCAGGGGT GCTTCTGAAG GAGTTCACTG TGTCCX3GCAA 480 

CATACTGACT ATCCGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCAT CAGCTC 540 

CTGTCTCCAG CAGCTTTCCC T6TTGATGTG GATCACGCAG TGCTTTCTGC CCGTGTTTTT 600 

GGCTCAGCCT CCCTCAaQGC AGAOGOGCTA AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 660 

GCCrCCTCCC CTAOOGAATG GTCCCAGCAC GAGTGGCCAG TTCATTGTGG GGGCCTGATT 720 
OTTTGTCGCT GGAGGAG6AC GGCTTACATG TTTGTTTCTG TAGAAAATAA AACTGA6CTA 

Seq ID NO: 210 Protein sequence: 
Protein Accession #: NP_001318.1 

1 11 21 31 41 51 

| | | | | | 

MQAEGR6TG6 STGOADGPGG P6IPD6PG(3I AGGPGEAGAT GGRGPRGAGA ARASGP6GGA . 60 

PRSPH06AA8 GUIOCCROQA RGPESRLLEF YLAMPFATPM EAELARR6LA QDAPPLPVFG 120 
VLIiKEFTVSG NZLTZRLTAA DHRQLQLSIS 8CLQQLSLLK WITQCFLPVF LAQFPSGQRR 

Seq ID NO: 211 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 52-459 

1 11 21 31 41 51 

CCTCGTGGGC CCTGACCTTC TCTCTGAOAG CCGGGCAGAG GCTCCGQAGC CATGCAGGCC 60 

GAAGGCCAGG GCACAGGGGG TTCGACGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 120 

GCTGATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG C6GGTGCCAC GGGGGGCAGA 180 

GGTCCCCGGO GG6CAGGGGC AGCAA6GG0C TCGG6GGCGA 6AGGAGGGGC 0CCGCGGG6T 240 

CGQCATGGCQ GTGCGGCTTC TGCQCAGGAT G6AA6GT6CC CCTGOGGGGC CAGGAGGGC6 300 
QACAGGCGCC T6CTTCAGTT CCOACTGACT GCTQCAGACC ACCGCCAACT GCAGCTCTCC 
ATCAGCTCCT GTCTCCAGCA GCTTTCCCT6 TTGATGTGGA TCACGCAGTG CTTTCTQCCC 
GTGTTTTTGQ CTCAGGCTCC CTCAGGGCAG AGGOGCTAAG CCCAGCCTGG CXKXICCTTCC 
TAGGTCATGC CTCCTCCCCT AGG6AATGGT CCCAGCAC6A QTGGCCAGTT CATTGTGGGG 
GCCTGATTGT TTOTGQCTGG ASGAGGACGG CTTACATOTT TGTTTCTGTA GAAAATAAAG 
CT6AGCTA 

Seq ID NO: 212 Protein sequence: 
Protein Accession #: Eos sequence 



360 
420 
480 
540 
600 
11 



21 



31 41 51 

| | | | | | 

MQAEOQGTGG STGDADGP6G PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 
PRGPH66AAS AQDGRCPCGA RRPDSRLLQP RLTAADHRQL QLSISSCLQQ LSLLMWZTQC 
FIiPVFLAQAP SGQRR 

Seq ID NO: 213 DNA sequence 
Nucleic Acid Accession #: NM_000555 
Coding sequence: 416..1498 



1 

I 

CTTATTTTTT 
AGCACAAAGA 
TCTGG6GGGA 
ACTCCCCCTT 
AACCTTGGGT 
TACAACTGTT 
CCCXSAAGTTC 
ACTT6ATTTT 
GATGAATGGO 
CTTGCAGGCA 
CCGCTACTTC 
CTTGCTGGCT 
TTACATTTAC 
G6AAAGCTAT 
CAATCCCAAC 
GGCTAGCAGC 
TACCATCATC 
GACAGCCCAC 
OGGGGTTGTC 
CTTTGGTGAT 
TOATTTTTCT 
TGGCCCAAAG 
CCGAAGCAAO 
CAAGTCTAAfl 
GGACCTGTAC 
GAGGGGAGAG 
TCTGCTCAAG 
TATTTTGAAA 
CACTGATCCA 
TCCAGGGATG 
TAAATTTSCC 
ATACAGCAAT 
ACAAACACAA 
GCAAGGCAGC 
CACCAATTCT 
TGCTTA6GAT 
ACATTTCGGA 
TTTGGAATCA 
TCTAGAAGGC 
TGTTCTATTG 
6GTCTGACAC 
CATGTCTGTA 
TTTTGTCTAA 
ATGTTATTTA 
AATAGCAGAG 
CATTAATAAT 
ATATTTTAAG 
CTTTCACTTG 
TAGTTTTCAG 
GCCCATAATG 
GCTTTNACCA 
CACAGCATCC 
CTAAATGGAA 
GTGTQTGTGT 
GTAATGGATT 
GTGGTGGTGG 
ACATTTTCXaG 
ACCCCAAATG 
CCATTAATAA 
GTGCCTCCCA 
GGGCATAAAG 
CACCCATAGT 
TG6AGGCTGG 
ACACTAGCTC 
GC3ATATATTT 



11 

I 

ATGAATGTCG 
CACTGGCTGT 
GGGfflkTGCAC 
CATAGTCATT 
AGCTCCTTCT 
AGTCATQTGG 
CAATTTGATA 
GQACACTTTO 
TTGCCTA6CC 
CTGAGTAAT6 
AAGGGGATTG 
GACCTGAC6C 
ACCATTGAT6 
GTCTGTTCCT 
TGGTCTGTCA 
AACAGTGCAC 
CGCAGTGGGG 
TCTTTTGA6C 
AAAAAACTCT 
GAT6ATGT0T 
CTGGATGAAA 
GCATCCCCAA 
TCTCGAGCTQ 
CAfiTCTCCCA 
CTGCCTCTGT 
T6CTCAGAGT 
TGTCCAACAG 
AACACATTGT 
CAGTTACCAA 
CAAAATGTGC 
CCGTTTAAAT 
TAAAAAGTTT 
GTGCCCCTTT 
TCCCCAGCCT 
GGCTGTCAAT 
CCTGGTGCTO 
AGAGTTTATA 
ATAAGCAATT 
TGTCTAACAT 
AATGCCTTGT 
CTTTCAGTCT 
TTTCAGGAGC 
AAAACACATG 
GAAATTATGC 
GCCAATTCAA 
TTCAAT6TG0 
CAACTCTTTT 
TCTTTAACAT 
TCTTTTGAGA 
6CAAAAACAA 
AATATAAAAA 
AAACCAAGCT 
TGAGCTTGCT 
GTGTGCATCT 
GGTGGCAACT 
GGTATCTCAA 
TTCAAGAAAA 
ATGAGGATCT 
GCCCATTTTA 
NAACATTTTO 
AATGGTGGGA 
NTCACTTTAG 
TAAAGAGCAG 
TNTGA6TATT 
TCTTTAGGAT 



21 

I 

GATAGCTGCA 
TCCCT6GAGG 
ACATTAC3AGT 
GTACTQAAAT 
GTTCTCTTCA 
GCATGTGTGA 
GGAGCCACTG 
ACGAAAOAGA 
CCACTCACAG 
AGAAGAAAGC 
TGTACXSCTGT 
GATCTCTGTC 
OATCCAGGAA 
CAGACAACTT 
ACGTAAAAAC 
AGGCCAGGGA 
TGAAGCCTCG 
AAOTCCTCAC 
ACACTCTOGA 
TTATTQCCTG 
ATGAATGCOG 
CACCTCAQAA 
ACTCAGCAAA 
TCTCTAOGCC 
CCTTGGATGA 
CCAGAGTACA 
GGCTATTGGT 
AATATGTTGG 
TTATGAGAGA 
TA6TCCATGA 
TTGCCCAAAC 
GTGTGGGGAA 
TCTCTGGATC 
CACTCTTCAC 
GQGGAGAAAT 
GGTTAGCTAA 
AAGCACAGTG 
GATAATAGTT 
ACCACATGAT 
TAACAGCCAA 
CTTTTTATAG 
AAACTCTTCA 
AAGAAAATTT 
TGTCACTGCC 
TA6AATCAGT 
ACCAGACATT 
TATCTATAAT 
TAGAAAGGAT 
TACAGGTTTA 
CTAATTTTAA 
TTCCCTTATT • 
GCTGTTTGGC 
GTGTGTGTGT 
GCAGCTGCTT 
GGGTGGCACT 
AT6CCCCTAG 
GItSAGATGAT 
CTTTTTGCCC 
CTAANCCCCT 
TAGTTAATTG 
GGCCTGATTT 
GTCTCATTTA 
GACCAGAGGA 
TCCTTGATTG 
AACCTTTQAA 



31 



CCAGCTTGGT 



A66AAAGAGG 
GCAAAGACTG 
AGGGQAATTP 
GGAAACAGAT 
TCAGTCTCTG 
TAAGACATCC 
CGCOCACTGT 
CAAGAA6GTA 
GTCCTCTGAC 
T6ACAACATC 
GATCGGAAGC 
CTTTAAAAAO 
ATCTGCCAAT 
GAACAAGGAC 
GAAGGCTGTG 
TGATATCACA 
TGGAAAACAG 
TGGTCCTGAA 
AGTCATGAAG 
GACTTCAGCC 
C6GAACCTCC 
CACCAGTCCT 
CTCGGACTCG 
AATCCAAGCC 
GCTTTCAAGT 
GTTTATTTTC 
TAGATTGATA 
gctttcaat6 
AGTTTTCCTT 
AAAAAAAACT 
TCAAGAATGG 
TCCPGATTGA 
AAACCAACAA 
GAGAATASAC 
AATTCCTGGT 
TGGA6TAA66 
TACATGAACT 
CACTGAAAAC 
CAAGAAATCA 
GOCTCCTTTT 
ACCAGAAAAA 
AAACAGTAAC 
TTTTTGATAG 
CTAATTATAT 
GCTAATATTT 
TTCTCTTTAC 
TAACACTGCT 
TTGAAGGTCT 
CCTTGGTAAT 
TACTGAATOG 
GTGGTGGTGG 
CAAAATTAAG 
GCT6ATGTGC 
ACAAGCTTCA 
GGTAGTACTG 
CCTCTCCTTT 
ATTTCTTTCT 
GGAAAAAGTG 
TAAAATTCAG 
GTCCATCACC 
A6AATCCAGA 
CX30TATATGT 
CCAACAATNT 



41 

I 

GGGGAAAGGG 
AAAGGAGAAT 
6CTTGGAATA 
CTTCCTAAGC 
TOTCAGGCTA 
GCCA6TTTTA 
AGGTTCCACC 
AGGAACAT6C 
A6CTTCTACC 
CGTTTCTACC 
CX3TTTTCGCA 
AACCT6CCTC 
ATGOATQAAC 
GTG6A6TACA 
ATGAAAGCX:C 
TTTGTGCGCC 
GGTGTGCTTC 
GAAGCCATCA 
GTAACTTGTC 
AAATTT06CT 
GGAAACCCAT 
AA6A0CCCTG 
AGCAGCCASC 
QGCAGCCrCC 
CTTGGTGATT 
TATCATTGTA 
TTTTATTTTG 
CTGTGATTTC 
ACCATCCTTT 
GAAA6CTTAG 
TTGTAGAGGG 
CATTGGCAGA 
TGGAGGACCC 
GGCCCGGGTT 
CTTATAATTG 
AOAATTGGAA 
CAATCTCTCC 
GACTTCATAT 
GTATGGTATC 
ACTGTGAGAA 
ATATCCTTTT 
TTATAAACT6 
AAAAAAAAAG 
CTCCAGGAGA 
CTTTTTAACA 
TTTAAATGAA 
CATACTGAAG 
TAAGGACTGA 
TTTTTTTTCC 
T6CTTGCCAM 
GGTGCA AAIN 
CTTGCAQTTG 
TGGGAGGGGG 
AAATACTACA 
ACTGTGTAGQ 
GAtGTCTGTA 
GTTTCTGGTG 
TTTTGTAAAC 
AGAAGCTCAG 
ATACTTGGAT 
GCCAGAACCC 
TTTATTTTAA 
TTTCCTTATG 
ACTACTAGAA 
TCAATAACAA 



51 
I 

TTTGATGAAT 
CTTAGTTTAT 
AAATGAAAAC 
TGGAGATGCT 
TGGATTCATT 
ATGTATTTAG 
AAAATATGGA 
GAGGCTCCOG 
GAACTAGAAC 
GCAATGQGGA 
GCTTTGACGC 
AGGGAGTGGG 
TGGAOGAAGG 
CCAAGAATOT 
CCCAGTCCTT 
CCAAGCTGGT 
T6AACAAGAA 
AACTGGAGAC 
TCCAT6ATTT 
ATGCTCAGGA 
CAGCCACAGC 
GTCCTATGCG 
TCTCTACCCC 
QGAAGCACAA 
CCATGTAAAG 
GTAGGGTACT 
TTGTTGTTGT 
TCCTCT6GGC 
GGGGCAGCAT 
GG6CCTGGGG 
6TGTTTAAAT 
TCCAAGAAT6 
TGGAAGGACA 
TGTTGTCCAG 
TGACACCAGA 
AATACTGCAO 
ACT6AGGCAA 
ACCTGATTCC 
CATCTATCTC 
TTTGTTTTCA 
TATAAAAATT 
GTGATTTTTC 
CXX3AAGAATA 
AAACAAGATG 
GTTATGCTTG 
ATGTTACA6C 
ACACAGAAAT 
TCATTTGAAA 
TGTAAACATA 
TCCTGTGTTG 
TTTGGAAAG6 
TTCCTCCACT 
TGGTGCATGT 
AGACACCCCT 
GGGGAACCCA 
GCTACCAAAA 
AAATTGAAAA 
CCATTCAAAA 
GGTTTNCTTA 
TAGGGGGTGT 
CCAATGACTC 
GTTGAGGAAG 
CTTGGGCCTC 
AATACCAAAT 
TAGTAGITCT 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
16S0 
1740 
IBOO 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2260 
2340 
2400 
2460 
2S20 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
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TCCATCTTAC TTTTAAT0C3A GTATAAGGAA ATGTTTCTTT ATGGCCATTT TGGA6GGAGC 3960 

AGGGGAT6AG GCTTGGCATA GTCCAAAATT TAAQTCTCCA ATAATTAATT GCATTTTAAA 4020 

TTGTTTTAAA TTGGCCCACT TTCAAC3GCAA TTTTTTTTGT aTGTCTOTAA CTOAGCTCCT 4080 

CCACCCCTGT CATTCACTTC CAATTTTACC CAATCCAATT TTAGCACTCA AGTTCCATTG 4140 

TGTTAATTTT TQCAOSGTCT ACACACATCA AGTCAGCAAG CATTTGOCAC CACTCXrCTAT 4200 

ACTTCTCCCT CTTTTTTACA CACACACACA CACACACACA CACAATCCAT CTCTTGCTTG 4260 

TTCCTACCTC CCTGATTTTT CTTCCCTACA GAAATAGAAA TAGGGACAAA GAAGGGGAAA 4320 

ATCTATATAT TGGGGCTGGG CTGAACAACT AACTTCATAA GTAGTATTAA CTAGGGGTAA 4380

ATTGAGAGAA AAGCTCCTTT TCTCTTCACT GTTTTGGAAA GGATAGCCAT TAGCATGACT 4440 

GCTTT6TGTC CTTATGGACT TTAGTATTAG CCTAQATTGA ATTATAGCOT TTTTCTAGCT 4500 

GAAGGAACCT TAAGATCACA TCATCTACTC CTCTACTCCA AATTTCTCAT TCTTCAGGCC 4560

AGGAAACGGA GACACAGA66 TAAAGTAATT TCCCCAA6QT CACACAGCTG GCTGGG6CAG 4620 

GATTGOGTTT ACAACCCACA TCTCCTGGCT CTTATTCCAG QGCCTTTTCC CaCTAAGTAG 4680 

TATTGCCTTC CATTA6GCTC CTGAGAGTTA TTTCTCAOGO TCATGTTGCA TCTTGGAGCC 4740 

ACATGCTGCT GOCCTGATCT CAjSTOGGAAA TNCACCCAGC AACCTAATAC AGC0CCTTTT 4800

CCCTGCATTC ACCTGGTTCC CATCCACATG GGTTGCAGAT GTCCTTGAAG AGAGTGAGGC 4860 

ATTGAGGGCC AATAGGAGCA ATGGGGTCCC TGGCCTTGTC CATCTGATTC AGGAGATCAC 4920 

TGCTCCATCG TGAGGAGCCC TCTGAATAGC CCCCCACTGA ATGCTTGCCT TGCCCAAATG 4980 

GAATGGAGGA AGATTGATTT TCTCCATCAG TTCACCTTGT GTCATCTCAT AATGGTTGGT 5040 

CTTTCCAG6C TGA6GGAAAT GTTTCTTGTT TCCAKAGTAN AAAAAAGAAA GAGTGGAACA 5100 

ATANCTTTGT TCATCCTAAC TTTCTGAOAT GGCTTTTCAA CATTTAAAAA AAACTAOTOT 5160 

GGTACCATTC ACTGGCANGA TTTNTTTTAG AATATGGGAG TAAGATGAGG TAGAGAAAAT 5220 

AACCTGGTCT CACTGTGGTT GCCCTCATCC ACAATQTCCC CAAAGCCATC CTGCTNTGAT 5280 

GAGGACAATT TCCA6GTATA AGCAAG6GGC TTTGTOACAA AAATGTACGC TGGCTGATGT 5340 

TAAACATTGQ CTCCTQTOTT TGCACCAAAA TAGCftAQCTG TGTGCTCTAT ACACTCTTCC 5400 

CAT06TCTTG TGTACACTGC TCCI6T6GCC TTCCACAGCA GAAACCAGG6 CAAAAGGGTC 5460 

CAAACACATG GTTTTCCTTG CTGCAAGGCT NTTCCTGGGA ACTAAGGGGG TATTTATTAG 5520 

TTCAGTTNTA AGAGACCTCC TTCTGGGCTT ACX:CCACTCC TCAGGTACTT CTCTCTCCTT 5580 

CXnrCCTTCTC CTCCACAGTC ACAAGTAACC AAGGAACXTTG AAA6TGGATG TGTAGCTATT 5640 

TGAAOAAGGC AAGGAACCCT QAGATTCTTC TTTQAATCCT TTAGTCCAAG TCTTAGACCA 5700 

OTGATTGQTG CTTACCTTGA ACAAAATTTT GTCTOTOTTC CTAATCOCTT CgUVTA CniTG 5760 

GGTACAATGC TCCCAATCAC CCTGCACATT TGATTCTAAA TGGCTTTTAT TTTTTAAAAA 5820 

TCCATATCCC TAQGACAAGA NAACAGGATG CCTATATCCC CAAAATGAGC. TCCAGGACAC 5880 

TGAT6GGAAT GATCCCAANG ATCACCCCAC CTCAGAAAAC GTCTGXGCCA AKAG ACTT CC 5940 

CCAOATAOAA NCACTGGGAC AGTGOTTTOA AOQACTTCTT TTATGGTIGT CCAGTTT6CT 6000 

ATGGAAATAA AAQGCATTOA TTTTTTAAAA AAGATGATTG GAACCTGTCT TTGGC CACAT 6060 

AQGGCCACTT GGATCCATTT CCAGGCCTTA CTCATATATT 6CCTTCACTG AAGGGCTTTG 6120 

GCTTTAAGTC CCAGACTGGT CTCCCAAGTG AACCATAAGT GTTTTGGAGC TCATCTGGGG 6180 

TGAGGCATGA GAATGTTGCC CCATCTATCC CTTCAGGAAA AGGTGCCTTC CCT CCCT TTC 6240 

TOCTAAAGCC TQGTCCCCAA AAATTOTTTT TGTCTCCAAA AGTCTAGTAT GGTCTTTATA 6300 

CACCCANACT CTTAGTGTTG CGTCCTGCCT TGTTTCCTTG TTAAGGATCT ATGCANACCT 6360 

CCCGCTTTGG CTTAGCTAGC GTGACATTOO CTATCATTTO ACAAGACTAA CTTTTTTTTT 6420 

TTTTTTTTTG ACTQAGTCTC CCTCTGTCAC CTAGQCTGGA GTGCAGTGGC ACAATCTTGG 6480 

CT06CTGCAA CXTTTCACCCT TC31CCTCCC31 GGTCGAAGCS ATTCTCCT6C CTCAGTCTCC 6540 

OGAGTAGCTG GGATTACAOG GQTGGGCCAC CAAATCTQGC TATTTTTTTA TTATTATTAT 6600 

TTTTAGTAGA GATGGGGTTT CA0CATGTT6 GCCAGACIGO TCTTGAACTC TTGGCCTCAA 6660 

ATTATCTGCC CACCTC3GGCC TCCCAAAGTG CTGGQATTAC AGGCATGAGC ACCATGCCCA 6720 

GCTGACAAGA CTAATTTTTT ATCCCTTGGT TTATTGGCTT CAACATCTTC TGGAATCAGA 6780 

GGTGATTTTT TCTTACCTTG GATGCCTGAG ACTAGGGGAG TATAGAAITC CAATTGGTAA 6840 

TTAAGGCATC TTTCTGCTCC TGATCAGAAG GGCAGGTTAG TTGGGA6AGG TCAGATGGCA 6900 

CAACAGAAGT CACCTTGTAA GTAA6GCAAA 6ACTTTGAAG GCATTAGOGT TTCTCATTAC 6960 

TTAG6TCAAT AACCTTGAGG GAATCAATGG CTTTTTTGCC GCTCTACCTC TTTGTGTATC 7020 

TCTTTGACTT TTCTTTCTCT GTCTAOTTTC CTCTGTTCTC AGTTTATATT CTATOTTATC 7080 

AOTCTCTCTT TCCACAGTAC AAAC31TCCAT CCTTTCTCCT GTGCAATTCT GTCTCTCOCT 7140 

CTTATTATCr TTATTTGTAC TTTTTCCTTC CTCCCTGTCT AGGCATTGGG CATGTGCCTC 7200 

TTCTTAGCCT GTGATTTTGC CTTGGGACTG ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

CTGQTCCTAC CCCAQTCCAA TCAGAAGTAT GTTGGTGGGG AATCAACCTG ATC CTGG CCC 7320 

TTTCTTCTTC TCCATTTTCA TTCX5TAATCC CCCTCAGCAG ATCTTTACAA GCA6TTTCCT 7380 

TATAGCTCAT GTATCTTTAG GTCTTTGCCT TCCAAGCACT GTACAGAATA CTTTGTOGTT 7440 

CXTTTTTTAGT CTGACATTTT GTGGAGCAGT GAAGC3GTGCT CAGAGACATA ATCAGCTQAA 7500 

GAGAAAAAAT CCACCCATGG ATTTATATCA GCTAAATACT AATAATTGAT TTTGTTTGAT 7560 

GTGCCCATAA TTTTTAAA6C' TGCAATA7AA TATAATGAGG 6ACCACAGGT AATTTCTCCT 7620 

GTCATTTGTT TTGGCTGGAT GGGGGTG6GG 6AGTAATTGC TTAAAGTTTT AOCATTACAC 7680 

ATTAAACTCT CTATAATAAT CTTGTTTQGQ GCTTOCTAAC TGTTGAGCTO TTTTAACTAA 7740 

ACTGGTAGGC AATCGGAGTT QATTTAAATG AAAAGATAAT TTAACAAATC TATACTATAA 7800 

AAAGAGACAT TTGCTTAATT GACATGTATT TTTTCCTTCT QAGTCACCTA AACATTTACT 7860 

CTT6ACACCA ACTGTTCATG ATACTGAATA 6ACA6TCCAT ATAAGAGAAA TTAGTGGACC 7920 

TAAA6AAGCC AOATTGTAGG TGTTAATTTA TTAAACAGAA TTGCAAAGCC CTTOGAAATG 7980 

TCACTGCTTG GCAATACCAT ATGGCATGCC AAAATTTACA ATGACTTTTC TTTATAAGTT 8040 

ATCCAAAAGG GATTTGAACA AGTAAGAGGT TATGCCAAAA TGTCTCCAAT GTATGGTCCT 8100 

GTAATATATT GCAGCTTGAA GCCAATGATC CCTTATGACT TGTA TACAAC TAATGCATGT 8160 

TTTATTGAAT TTTGCATTTC CCAOSTGTGG TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220 

NT6T6CCATT AAACTTGTAC AGAAAAT8TT TTTATG6CCA TTTTCAAAGG GAGAAAOTTT 8280 

AAAATGGAAA CAGCCCACCC TTTCTGCCCT ATAGCTGTAQ TTAGAATTGA OTACCrGTAO 8340 

CAAAACAGCT GTAATTGGTG GTT6TAGTGT TAGAGGTGTT AGCTTGCTAG TGACTA6CTT 8400 

TGQAGAGTAA ATGCATGGTA TTOTACATCA CATTTCTTAA CTOGTTTTAA CCTCTGAAAA 8460 

QAATATATTC TTCTTTGTAG TCCTTCTTCC CACCCCCTTG CCCTCTCOCT CTCCCT6CTC 8520 

GCAGTTGTCT TACAOTTGTA AATATCTQAT TTGAGGCCCA ATAACTCTTG CCAAGTAAAG 8580

TCAGCAAACA ACAAACAAAC CAAAATGTGG GGAAAAG6CA TTTCTCAACC ATCTCTCAGC 8640 

AGTTATT6AT CATTTCTTAA GGAACAGCAT TGTGATCAAA GACTCAACTT TAC6TAAAAA 8700 

TCAGTGGTAA ATTGGGGrTG TATTGGCCAT TGATTACATT CAGGATTGAA TAGTTTTCAG 8760 

AATCACATGT AATCCAAAGA CAGTAGGTAG TGATGTCCCT TATCCCT6CA GCTGTTTTAA 8820 

QATA6AGACC TCAGAAGACT CTGCTTGACC 6ATGAGCAAT AATTATTTGA AAAAAAAAOA 8880 

AAAAATGAQA GAAATAAAAC AGATATTTAA GAACTTTAGC CACCTATTTA GAATA8TTAT 8940

AGCCA6AAAA AAAAACAAGG GCATGAGTTC AAATGCATTA CTATCAGT6T CCTAGGCAAT 9000 

ACCTAACCTA CTCTGAAATT GT6ATTCAAA AGCAGTATTT CAAGAGGCAT TCTCCTTTTT 9060 

TGGTTTGCTG ACCCCACTTG GACTGGTAGG TTTGGTGAGG CCCCCATAAA CCAGCTGGAG 9120

CA(3ACCCTTT TCATCTCCTG TGCCTGTAAC ACCCCTCTTC CCCCACCCCC TCCGCAATTC 9180

AATGAGGGCT TTCTTQGGTC AGAQSACTTC AAGGTTGTCT AGAGAAGTTT GCCATGTGTG 9240

TAAGGTGCTG TGAACTGTGA GTGCTGAAGA TTCGCAGC3VT TCAATACCAG GCAGCCAAAG 9300

AGCTGCTCTT GCAATTATTT TGGCTCTC3UI GCTCTGTTCT TCATOSCATT CTCATTTCTG 9360
TGTACATTTG CAAGATQTGT GTAATGTCAT TTTOCAAAAA TAAAATTTGA TTTCAAT



Seq ID NO: 214 Protein sequence:
Protein Accession #: NP_000546



1 11 21 31 41 51 

| | | | | |

MELDFGHFDE RDKTSRNMRG SRMNGLPSPT HSAHCSFYRT RTLQALSNEK KAKKVRFYRN 60

GDRYFKGIVY AVSSDRFRSF DALLADLTRS LSDNINLPQG VRYIYTIDGS RKIGSMDELE 120

EGESYVCSSD NFFKRVEYTK NVNPNWSVNV KTSANMKAPQ SLASSNSAQA RENKDFVRPK 180

LVTIIRSGVK PRKAVRVLLN KKTAHSFEQV LTDITEAIKL ETGVVKKLYT LDGKQVTCLH 240

DFFGDDDVFI ACGPEKFRYA QDDFSLDENE CRVMKGNPSA TAGPKASPTP QKTSAKSPGP 300
MRRSKSPADS ANGTSSSQLS TPKSKQSPIS TPTSPGSLRK HKDLYLPLSL DDSDSLGDSM



Seq ID NO: 215 DNA sequence
Nucleic Acid Accession #: NM_130467
Coding sequence: 312..644

1 11 21 31 41 51

| | | | | |

GGOVOQAGGC AGAGCTCTGC AAGGAQAGGT TGTGTCTTCX3 TTCTTTCCGC CATCTTCGTT 60 

CTTTCCAACA TCTTOGTTCT TTCTCACTQA CCSA6ACTCA GCCGGTAGGT CTGCAQAGTG 120 

GTCTTCCTGG TAATTTAOTT GTGAGTGAAT GTGT6QAGGA GCCAGCGGGC TTAGGACAGG 180 

TCCTGTGGCA CAGTOCGTGG CTTTGAGGGA AAAGGGOCTC GOGGTGGTCC TCCGCCTTCC 240 

CCCAGGTOGT GATGCAG3CG CCATGGGCOG GTAATC6T0G CT6GGCTGGA ACGAGGGAGG 300 

AAOTGAGAGA TATQAGTGAG CATGTAACAA GATCCCAATC CTCAGAAAGA GGAAATGACC 360 

AAGAOTCTTC CCAQCCA6TT GQACCTGTGA TTGTCCAGCA GCCCACTGAG GAAAAACGTC 420 

AAGAAGAGGA ACCACCAACT GATAATCAGG GTATTGCACC TAGTGGGGAG ATCAAAAATG 480 

AAGGAGCACC TGCTGTTCAA GGGACTGATG TGGAAGCTTT TCAACAGGAA CTGGCTCTGC 540 

TTAAGATAGA GGATGCACXTT GGAQATGGTC CTGATGTCAG GGA6GGGACT CTGCCCACTT 600 

TTGATCCCftC TAAAOTGCTG GAAGCAGGTG AAOGGCAACT ATA0(3TTTAA ACCAAGACAA 660 

ATGAAC3ACTG AAACCAAOAA TATTGTTCTT AT6CTGGAAA TTTOACTGCT AACATTCTGT 720 
TAATAAAOTT TTACAGTTTT CT6CMAAAA AAAAAAAAAA AAA 



Seq ID NO: 216 Protein sequence:
Protein Accession #: NP_569734

1 11 21 31 41 51

| | | | | |

MSEHVTRSQS SERGNDQESS QPVGPVIVQQ PTEEKRQEEE PPTDNQGIAP SGEIKNEGAP 60
AVQGTDVEAF QQELALLKIE DAPGDGPDVR EGTLPTFDPT KVLEAGEGQL

Seq ID NO: 217 DNA sequence
Nucleic Acid Accession #: NM_001476.1
Coding sequence: 82..435

1 11 21 31 41 51

| | | | | |

GCCAGGGAGC TGTGAGGCAG TGCTGTGTOO TTCCTOCCGT CCOOACTCTT TTTCCTCTAC 60 

TGAGATTCAT CTGTGTGAAA TATGAGTTGG CSGAGGAAGAT CGACCTATTA TTGOCCTAQA 120 

CCAACSGCGCT ATGTACAGCC TCCTGAAGTQ ATTGGGCCTA TGOGGCCCGA GCAGTTCAC5T 180 

GATGAAGTGG AACCAGCAAC ACCTGAAOAA GGGGAACCAG CAACTCAAOG TCAGGATCCT 240 

GCAGCT6CTC AGGAGGGASA GGATGAGGGA GCATCT6CAG GTCAAGGGCC GAA6CCTGAA 300 

GCTQATAGCC AGGAACAGGG TCACCCACAG ACTGGGTGTG AGTGTGAAGA TGGTCCTGAT 360 

GGGCAGQAGG TGGACCCGCC AAATCCAGAG GAGGTGAAAA OSCCTGAAGA AGGTGAAAAG 420

CAATCACAGT GTTAAAAGAA GACACGTTGA AATGATGCAG GCTGCTCCTA TGTTGGAAAT 480
TTGTTCATTA AAATTCTCCC AATAAAGCTT TACAGCCTTC TGCAAAA

Seq ID NO: 218 Protein sequence:
Protein Accession #: NP_001467.1

1 11 21 31 41 51

| | | | | |

MSWRGRSTYY WPRPRRYVQP PEVIGPMRPE QFSDEVEPAT PEEGEPATQR QDPAAAQEGE 60
DEGASAGQGP KPEADSQEQG HPQTGCECED GPDGQEVDPP NPEEVKTPEE GEKQSQC

Seq ID NO: 219 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 90-3671

1 11 21 31 41 51

| | | | | |

ACA6GG6AGC 6CA6AGTGAG AACCACCAAC G6AGG0GCGG GGCAGC6ACC CCTGCAGCG6 60 

AGACAGAGAC TGAGOQGCCC GGCACCGCOV TGCCTGCGCT CTGGCTGGGC TGCTGCCTCT 120 

GCTTCTCGCT CCTCCTGCCC GCAGCCCGGG CCACCTCCAG GAGGGAAGTC TGTGATTGCA 180 

ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 240 

TCCGCTGCCT CAACT6CAAT GACAACACTG ATGGCATTCA CTGCGAGAAG TGCAAGAATG 300 

GCTTTTACCG GCACAOAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 360

CTCTTAGTGC TCOATGTGAC AACTCTGGAC GGTGCAGCTG TAAACX»GGT GTGACAGOAG 420

CCAGA3GCGA CCGMOTCT6 CCAG6CTTCC ACA3!6CTCAC GCa^TGGGGGO rtGCACCCAAG 480 

ACCA6A6ACT GCTAGACTCC AAGTGTGACT GT6ACCCA6C TGGCAT08CA OGGCCCTOIG 540 

ACGOGOGCCG CTGTGTCTGC AAGCCAGCTG TTACTGGAGA AOGCTGTOAT AGGTGTOCSAT 600 

CAGGTTACTA TAATCTGGAT GGGGGGAACC CTGAGQGCTG TACCCAGTGT TTCTGCTATG 660 

G6CATTCAGC CAGCTGCOGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720 

TTCATCAAGA, TGTTGATGGC T6GAAG6CT6 TCGAAOGAAA TOGGTCTOCT 6CAAAGCTCC 780 

AATGGTCACA GCGCCATCAA GATGTGTTTA QCTCAGCCCA, ACGACTAGAC CCTGTCTATT 840 

TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900 

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATGTG ATTCTGGAAQ 960 

GTGCTGGTCT AC6CSATCACA GCTCCXTTTGA TGCCACTTGG CAAGACACTQ CCTTGTGGGC 1020 

TCACCAAGAC TTACACATTC AflGWftAATG AGCATCC3kA0 CAATAATTGG A6CCCCCAGC 1080 

TCAGTTACTT TCAGTATOGA AGGTTACTGC GGAATCTCAC AGCCCTCCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCT6ATTTCA OCCCX3CCCT0 1200 

TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACA AGGGG C 1260 

AATTCTGCXA GGATT6TGCT TCTGGCTACA AGAGAOATTC A60GAGACTG GGGCCTTTTG 1320 

GCaCCTOTAT TCCTTCTAAC TGTCAAGGG6 GAGGGGCXnO TQATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GQATOAGAAT CCTGACATTQ AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATOTCCCTS TCATAACGGG TTCAGCTGCT 1500 

CAGTGATGCC GGAGACGGAG GAGGTGGT6T GCAATAACTG CXXTTCCOGGG GTCACCGGTG 1560 

CCOGCTGTGA GCTCTGTGCT GAT6GCTACT TTGG6GAC0C CTTTGGTGAA CATOGCCCAG 1620 

TOAGGCCTTQ TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GT6ACC6GCT GACAGGCAGO TOTTTGAAOT 6TATCCACAA CACAGCCGGC ATCTACTOCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC OVTTGGCTCC CAACCCAGCA GACAAGTGTC 1800 

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTCGAAGT GATOGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGA6CA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAQTQAAO ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AQGCCCTOAT TTCAAA6GCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AOGACATTCT 6AGAGAT6CC CAGATTTCAG 2100 

AAGGTGCTAG GAGATCCXTTT GGTCTCCAGT TGGCCAAGGT QAGOftOCCa^ GA6AACAGCT 2160 

ACCAGAGCCQ CCTGGATGAC CTCAAGATGA CT6TGGAAAG AGTTCGGGCT CTGGQAAGTC 2220 

A8TACCA6AA COGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGA6CCTGG 2280 

CAGAAAGTGA A6CTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TA CGTG GGGC 2340 

CAAATGGCTT TAAAAQTCTG GCTCAOGAGG CCACAAGATT AGCAGAAAGC CACGTTQAGT 2400 

CAGCCAGTAA CATGGAGCAA CTQACAAGG6 AAACTOAOGA CTATTCCAAA CAAGOCCTCT 2460 

CACTGGTGCXS CAAGGCCCTG CATGAAGGAG TCX5GAAGCGG AA6CGGTAGC CCGGAOSGTO 2520 

CTGTQOTGCA A0QGCTT6TG GAAAAATTG6 AGAAAACCAA GTCCCTOGCC CAGCA6TTGA 2580 

CAAGGGA66C CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640 

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TC3W3TGATCA GTCCTTTCAG GTG6AAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GOTAACCAGG CATATGOATO 2760 

AGTTCAAGCG TACACAAAAG AATCTGQQAA ACTGGAAAGA AGAAGCACAG CAOCTCTTAC 2820 

AGAATGGAAA AAGTGGGAGA GAQAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000

AAGCCATGAA GAGACTCTCC TACATCAGOC AGAAGGTTTC AOATGCCftOT GACAAQACCC 3060 

AGCAAGCAGA AAfiAGCCCTG GGGAOCOCTG CTGCTGATGC ACAGAGGGCA AAGAATOGGG 3120

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TT6AACAGGA GATTQGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

0TGAQAT6AG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACGAATA 3300 

TG6ATGCAGT ACAGATGGTG ATTACA6AAG CCCAGAAOGT TGATACX»GA GCXAAGAAGG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACX3G CCTCCT6CAT CTQATGGACC 3420 

AGCCTCTCAG TGTAGATGAA GAGGQGCTGG TCTTACTGGA GCAGAAGCTT TCCOGAGCCA 3480 

AGACCCAGAT CAACAGCCAA CTGCQGCCXa TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540 

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600 

AGAACTTCGA 6AACATTAG6 GACAACCIGC CCCCAGGCTG CTACAATACC CAGGCTCTT6 3660 

AGCAACAGTC AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGG6ATA CAGATCTCAG 3720 

GGCTCX3GGAG CCATGTCATG TQAGTGGGTG GGATGGG6AC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAG6 TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATCCTGG GCAATQAQGC AGATAGCACT GGGTGTGAGA 3900 

ATCATCAAGG ATCTGGA<rC CAAAGAATAG ACTGOATGGA AAOACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGQAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020 

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200 

ACCCAGQGTC 'TOAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAG6CCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TOGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AA6CATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTOQ 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAQ AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCrA ATTCAATOCT ACTTTTOGAA CACCAAAAAT GAT6C3GCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTO CCAGQGGCTG 4740 

6TGGGACAGT QGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GA OQAAG ACA 4800 

AOCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCyVCAAGT GQTGTTTATT 4860 

GCAATAACC36 CTTGGTTTGC AACCTCTTTG CTCAACAQAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAQAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACT0ATT6C AACAGACTGT TGAGTTATGA 5040 

TAACACCACSr GGGAATTGCT GGAGGAACCA GAOGCACTTC GACCTTGGCT GGGAAGACTA 5100 

TOGTGCT60C TTGCTTCTGT ATTTCCTTGG ATTTTCCTQA AAGTGTTTTT AAATAAA6AA 5160 
CAATTGTTAG ATGCC 

Seq ID NO: 220 Protein sequence:
Protein Accession #: NP_005553

1 11 21 31 41 51

| | | | | |

MPALWLGCCL CPSLLLPAAR ATSRREVCDC NGKSRQCIFD RELHRQTGNG FRCLKCSIBNT 60
DGIHCERCKK GFYRHRERDR CLPC33aiSKG SLSARCmiSG RCSCKPGVTG ARCDRCLPGF 120
HMLTDAGCTQ DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VTGERCDRCR SGYYNLDGGN 180
PEGCTQCPCY OaSASCRSSA EYSVHKITST FHQDVDGWKA VQRMGSPAKL QWSQRSQDVF 240
SSAQRLIDPVy FVAPAKFLGN QQVSYGQSlJs FDYBVDRGGR [missing] GAGIAITAPL 300
MPLGKTLPGO LTICTYTFRUJ EHFSIQIWSPQ LSYFEYRRLL RNLTALRIRA TYGBYSTGYI 360
DNVTLISARP VSGAPAPWVE QCICPVGYKG QFCQDCASGY KRDSARLGPF GTCIPCNCQG 420
GGAC3)PDTGD CYSGDENPDI ECATCPIGFY KDPHDPRSCK PCPCHNGPSC SVMPETEEW 480
CNNCFPGVTG ARCEIiCADGY FGDPFGEHGP VRPOQPCQCN NNVDPSASGN amLTGRCLR 540
CIHSTAGIYC DQCKAGYFGD PIAPNPADKC RACaiCNPNQS EFVGCXSDGT CVCKPGFGGP 600
NCBHGAFSCP ACYNQVKIQM OQFMQQliQRM EALISRAQGG DGWPDTELE GR^4QQAEQAL 660
QDILRDAQIS EGASRSLGLQ LAKVRSQENS YQSRLDDLXM TVERVRALGS QYQNRVRDTH 720
RLITQMQLSli AESEASLGMT NIPASDHYVG PKGFKGLAQE ATRIAESHVE SASNMEQLTR 780
ETEDVSKQAL SLVRKALHEG VGSGSGSFDG AWQGLVBKL EKTKSIiAQQL TREATQAEXE 840
ADRSYQHSLR LLDSVSRLQG VSDQSFQVEE AKRIXQKADS LSTIiVTRHMD EFKRTQKZIU3 900
NWKEEAQQLL QNGKSGREKS DQLLSRANLA KSRAQEALSM GHATFYSVES ILKNZjREFDL 960
QVDNRKAEAE EAMKRLSYIS QKVSDASDKT QQABRALGSA AADAQRAKKG AGEAIiSISSE 1020
lEQEIGSLNfL EANVTADGAL AKBKGLASIiK SEMREVEGBL ERKELEFDTN MDAVQMVITE 1080
AQKVDTRAiCN ASVTIQDTLM TLDOLLHLMD QPLSVDEEX3L VLLEQKLSRA KTQINSQLRP 1140
MMSELEERAR QORGBLBLLS TSIDGILADV [missing] PFGCYMTQAL [missing]

Seq ID NO: 221 DNA sequence
Nucleic Acid Accession #: NM_016529
Coding sequence: 13-1854

1 11 21 31 41 51

| | | | | |

GTCAAGAAAA [missing] [missing] ACTCCTTCAG GACGACTTCG GCTTTACTGT 60
AAAGGGGCTG [missing] [missing] CTTTCAAAAG ACTCAAAATA TATGGAGGAA 120
ACATTATOGC [missing] [missing] GAAGGCTTGC GGACTCTCTG TGTGGCTTAT 180
GCTGATCTCT [missing] [missing] TGGCTGAAAG TCTATCAGGA AGCCAGCACC 240
ATATTGAAGG [missing] [missing] GAGTGTTAOG AGATCATTGA GAAGAATTTG 300
CTGCTACTTG [missing] [missing] CXSCCTTCAAQ CAGOAOTTCC AGAAACCATC 360
OCAACACTGT [missing] [missing] TGGGTGTTGA CAGGAGACAA ACAAGAAACT 420
GGQATTAATA [missing] [missing] GTATCGCAGA ATATGGCCCT TATCCTATTG 480
AAGGAGGACT [missing] [missing] GCCATTACTC AGCACTGCaC TGACCTTGGG 540
AATTTGCTGG [missing] [missing] CTCATCATC30 ATGGCCACAC CCTGAAGTAC 600
GOGCTCTCCT [missing] [missing] CTGGATTTGG CACTCTCGTG CAAAGCGGTC 660
ATATGCTGCA [missing] [missing] TCTGAGATAG TGGATGTGGT GAAGAAGCGG 720
GTGAAGGCCA [missing] [missing] GGCGCCAAC3G ATGTCGGGAT GATCCAGACA 780
GCCCACGTGG [missing] [missing] GAAGGCATGC AGGCCACCAA CAACTCGGAT 840
TACGCCATCG [missing] [missing] AAGCTTCTGT TGGTTCATGG AGCCTGGAGC 900
TACAACGGGG [missing] [missing] TGCTTCTATA AGAAGGTGGT CCTGTATATT 960
ATTGAGCTTT [missing] [missing] TTTTCTGGGC AGATTTTATT TGAA0GTTGG 1020
TGCATCGGCC [missing] [missing] GCTTTGCCGC CCTTCACTCT GGQAATCTTT 1080
GAGAGGTCTT [missing] [missing] AGGTTTCCCC AGCTCTACAA AATCACCCAG 1140
AATGGCQAAG [missing] [missing] TGGGQTCACT GCATCAACGC CTTGGTCGAC 1200
TCCCTCATCC [missing] [missing] GCTCTQGAGC ATQATACTGT GTTTGACAGT 1260
GGTCATGCTA [missing] [missing] AATATTGTTT AGACATATGT TGTTGTTACT 1320
GTTTGTCTGA [missing] [missing] GCTTGGACTA AATTCAGTCA TCTGGCTGTC 1380
TGGGGAAGCA [missing] [missing] TTTGGCATCT ACTCGACCAT CTGGCCCACC 1440
ATTCCXaTTG [missing] GAGAGGACAG GCAACTATGG TCCTGAGCTC CGCACACTTC 1500
TGGTTGQQAT [missing] TCCn^CIQCC TGXTTQATTG AAGATGTGGC ATG@tf3ASCA 1560
GCCAASCACA [missing] GACATTGCTG GAGGAGGTGC AGGAGCTGGA AACCftAOTCT 1620
CGAGTCCTGG [missing] GCTGCGGGAT AGCAATGQAA AGAGGCTGAA CGAGOGCGAC 1680
CGCCTGATCA [missing] CCGGAAGACX3 CCCCOSAOGC TGTTCCGGGQ CAGCTCCCTG 1740
CAGCAGGGCG [missing] GTATGCTTTT TCTCAA8AAG AACACGGAGC TGTTAGTCAG 1800
GAAGAAGTCA [missing] TGACACCACC AAAAAGAAAT CCAGGAAGAA ATAASACATG 1860
AATTTTCCTO [missing] GQAAAOAGAT TCAGTTTQTT GCACCCAGTQ TTAACACATC 1920
TTTGTCAGAG [missing] TCCAAGGCCA AAACACCAGG AAACACATTT CTGTGGCCTT 1980
AGTTAAGCAG [missing] ACATATTCCC TCGCAAACCT GGAGTGCAGA CCACAGGGGA 2040
AGCTATCTTT [missing] CTCGTCTGCA GTGCTTAGCC TAACmTTGT TTATGTCGTT 2100
ATGAAGCATT [missing] CTOTGAGGTC TCAAATTAAA AACATTATGT TTCACCAATA 2160
AGAAAAAAAA

Seq ID NO: 222 Protein sequence:
Protein Accession #: NP_057613

1 11 21 31 41 51

| | | | | |

MSVIVRTPSG RLRLYCKGAD NVIPERLSKD SKYMEEETLCH LEYFATEXSIjR TLCVAYADLS 60
ENEYEBWLKV YQBAETILKD RAQRLBECYE IIEKNLLLLG ATAIEDRLQA GVPETIATLL 120
KABIKIWVLT GDKQETAINI GYSC3U/VSQN MALIIiLKEDS LDATRAAITQ HCTDLGNLLG 180
KENDVALIID GHTLKYALSF EVRRSFLDIiA LSCKAVICCR VSPLQKSEIV DWKKRVKAI 240
TLAIGDGAND VGMIQTAHVG VGISGNEGMQ ATMNSDYAIA QFSYXiEKUjL VHGAWSYNRV 300
TKCILYCFYK NWLYIIBLW FAFVNGFSGQ IIiFBRHCZGL YNVIFTALPP FTLGIFERSC 360
TQESMLRFPQ LYKITQNGEG FNTKVFWCSC INALVHSLIL FWFPMKALBK DTVFDSGHAT 420
DYLPVGNIVY TYWVTVCLK AGLETTAWTK FSHLAVWGSM LTWLVFFGIY STIWPTIPIA 480
FDMRGQATMV I.SSAHFWLGL FLVPTACLIE DVAWRAAKHT CKKTXiLEEVQ EEiETKSRVLG 540
KAVLRDSNGK RUIERDRLIK RLGRKTPPTL FRGSSLQQGV PHOYAFSQEB HGAVSQBEVI 600
RAYDTTXKKS RKK



Seq ID NO: 223 DNA sequence 
Nucleic Acid Accession #: BC017001 
Coding sequence: 1-394 



1 11 21 31 41 51

| | | | | |

AAOGCTOGGC AGQGCOGGCG 00GQTC3GGG0 OaOQCCOOAO GGOCCCQGGC OQAGOGOOSQ 60 

OGOQCAGGQC GGCAGCATCC ACTCGGGCOG CATOaCOQCG CSTGCACAAOG TGCO GCTG AQ 120 

COTGCTCATC CX3GCCGCTGC CGTCCGTGTT GGACCCOGCC AAGGTGCAGA GCCTCGTGGA 180

CACGATOCGG GAfiGACCCAG ACAGCGTQCC CCCCATCGAT GTCCTCTGGA TCAAAGGGGC 240 

CCAGGGAGGT GACrACTTCT ACTCXTTTTGO GGGCTGCCAC CGCTACGOGO CCTACCAQCa 300 

ACTGCAGOGA GAfiACCATCC OCGCCAAGCT TOTCCAGTCC ACTCTCTCAfl AGCIAAOOGT 360 

OTACXITGCSGA GCATCCACAC CAOACTTGCA GTAGCAGCCT CCrTTGOCACC TGCTQCCACC 420 

TTCAAGAGCC CAGAAQACAC ACCTGGCCTC CAGCAGGCTS GGCCATOCAO AAGGQATAGC 480

AGGGGTGCAT TCTCTTTGCA CCTGGCXSAGA GGGTCTGACT CTGGGCACCC CTCTCACCGG 540 

CTACAAGGCC TTGOACTCAC TGTACAGTGT QGQAGCCCCA GTTCCCAC3CT CTGTOACAAT 600 

AGGATCATGO CCTTAOCCTT GAAGCATTAC CXSaaAAOGAG AACAGAOATG 6GCTTGAAGA 660 

GOCACGTGCT GCC3G6CTCCA AATTCCCAAO GACAAGGATC CCTCT6CATT TTTGTOTATG 720 

TAACCTCTTA TATGGACTAC ATTCAGCTGC AAGGAAAGQA AAACCTTGAT TGCAGTGGTT 780 

TAAACAAACA GAAGAITGTT TTTCCACATA GCATGGATTC TGGAGATGGG TQGCTAATGG 840

TATTQGTTCA ACAACTCCAC GGAGGTAGGG GTCAOGTCTT GGATCCTTTT GCCTTAATCT 900 

CAGTOCTOST TACTTCATGG TCCCAAGATG GCTGCTGTAT CCCCAAGAAT CATGTCTGCX3 960 

TTCAAGGAAG GAQGG6TGGA GGAAGAGGAA G06CCAAACT AGCTGGACCC 6TCACCTTCT 1020 

ATCAGAAAGT AAAACCTCGT CAGAAGTCTG TTTCCTGCTC TCTCCCTCTG CATATCTTCA 1080 

CTTAGATGCC CTTOGCCCGA GCCAGCTACC ATTGCACCTC TAGCTQCAAA CAAAGCTAAG 1140 

ACAGCAGGGA ACAGAATTGT CATCGCTGAA TAGACCAATC GTGTTCCATC TACTGAGACT 1200 

OGCACACTGC CTCCTGCaiAT AAAACTGGGA TCCCATTAOC AAQAGAGAAA TGCAGAATTG 1260 

TGTACCA6TT AGCTTTTGCT GTGTAAC3UA CCATCCCCAA ACTTGGCAGC TAGAAACAAA 1320 

CCCTGTATTT TCCCACAATC CTATGGGTTG GCAATTTGGG CTGGGCTCAA CAGGGCAGTT 1380 

CTGCTGCTCA CACCTGGGAT CCCTCATGGA GCTAAGGTCA GCTGTTACCT CAGCTGGQCC 1440 

TG6ATGGTCT AGGATA6CCT TACTCACTTG CCTGGCAGGT GACAGGCTGT TGOCTGGAAT 1500 

T6CTT06TTC TCCTCCATGT GGOCTCTCCA OCAGGCTAGC TCAGGCTTAT TCACATGATG 1560 

OCTTCAQGAT TCCAAAGAGA GTQAOAGTAG AA6CTGAAAG ACTTCTTGAG TTCTTGGCCT 1620 

G6AACTGGGA CTAGGACAGT GrCACTTCTG CTAAGTTCTT TTGGTCAQAG CAAATCACAA 1680 

GGCTTTACCC AGATTCAAGG 6ATGAGAAAC AGACTACATO TCTTGATGAG 6GGAACCACA 1740 

AAGAGCTTGT GGCCATTTTT CACCTATCAC AAATAATTTT GGATQGGTAT TTATTT GGAT 1800 

AAAGOTATTT CTCTCTTCCC CCTTTCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860 

GAAGGCACTA AGACATTOTC CTGGCCCTCA GGGTCTAGQG GAAGAGGTX3T TGGGGCAGGA 1920 

AGTGAGTCTC TCCATGGGCT GGACCCACTG TAGTAGGAGT GCCTCCrTOT CTQCAC TGCT 1980 

GGTATGGGGT TAGGCCAGGT AGGACATTCC AGAGGQGCTT CTGAAAACCA AflAOTOTTG 2040 

GGGAAAGGGA ACA6AGTAAG GCAGGCCTTG TTCTCACTGC CCTCTAAGGG AACTTOQTCA 2100 

CTOGGCaCrr TTAAGCCTCA GTTTCTCCAG TTCAATAATA AGGACAAGAG CTTTTCCCAT 2160 

GCATTCTCTT TCCCCGGGAA ACTTGACTGA GGTGACCAGT AATAGAATPQ AAAAGGQAGA 2220 

OTOTCTTCAG TGCAATGT6G CATCCTGGAT TGGGTCTTGG AACAAAAACA GGACXTTAGT 2280 

GGGAAAATTG GAAATCTCAA AAAAGTCTGA ATTTTAGTTA ATATACCAAT TTCAOTCTCT. 2340 

TOOTTTTGAC AGATCTACCA TGOTGATGTA AGATGTTGAC CTTGOaOTAG GCIGG0T6AA 2400 

GGGTATACAG GAACTCTTTO TACTATCTCT GCAACTTCTC TGTAAATCTA GTATCATTOC 2460 
AAAATAAAAG TTTATTTAAT TTAAAAAAAA AAAAAAAAAA AA 



Seq ID NO: 224 Protein sequence:
Protein Accession #: AAH17001.1

1 11 21 31 41 51 

TLGRA6AGRG APEX3PGPSGG AQGGSIHSGR lAAVHMVPIiS VLIRPIJPSVL DPAKVQSLVD 60 
TIREDPDSVP PIDVLWIKOA QGODYFYSFG GCHRYAAYOQ LORBTIPAKL VQSTIiSDLBV 120 
YLGASTFDLQ 

Seq ID NO: 225 DNA sequence
Nucleic Acid Accession #: NM_021048
Coding sequence: 1..1110

1 11 21 31 41 51

ATGCCTCGAG CTCCAAAGCG TCA6CX5CTGC ATGCCTGAAG AAGATCTTCA ATCCCAAAGT 60 

GAGACACA6G GCCTCGAGG6 TGCACAGGCT CXCCTGGCTG TGGAGQAGGA TGCTTCATCA 120 

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCC 180 

TCCTCCTQCT ATCCTCTAAI ACCAA6CACC CXaGAOGAGG TTTCTQCTGA TGATGAGACA 240 

CCAAATCCTC CCCAGAGTCC TCAGATAGCC TGCTCCTCCC CCTCGGTCGT TGCTTCCCTT 300 

CCATTAGATC AATCTGATGA GQGCTCCAGC AGCCAAAAGG AGGAGAGTCC AAGCACCCTA 360 

CAGQTCCTGC CAGACAGTGA GTCTTTACCC AGAAGTGAGA TAGATGAAAA G6TGACTGAT 420 

TTGGTGCAGT TTCTGCTCTT CAAGTATCAA ATGAAGGAGC CQATCACAAA GGCAGAAATA 480 

CTGGA6A6T6 TCATAAAAAA TTATGAAGAC CACTTCCCTT TGTTGTTTAG TGAAGCCTCC 540 

GAGTGCATGC TGCTGGTCTT TOGCATTGAT GTAAAGGAAG TOGATCCCAC TG6CCACTCC 600 

TTTGTCCTTG TCACCTCCCT GGGCCTCACC TATQATGGGA TGCTGAGTGA T6TCCAGAGC 660 

ATGCCCAAGA CTGGCATTCT CATACTTATC CTAAGCSiTAA TCTTCATAGA GOQCTACTOC 720 

ACCCCTGAGG AGOTCATCTG GGAAGCACTG AATAT6ATGG GGCTGTATGA TGGGATGGAO 780 

CACCTCATTT AIOGGGAGCC OUSGAAGCTG CTCaCCCAAG ATTGGGTGCA GGAAAACTAC 840 

CTGGAGXACC GGCAGGTCCC TGGCAGTGAT CCTGCACGGT ATGAGTTrCT QTGQGGTCCA 900 

AGGGCTCATG CTGAAATTAG GAAGATQAGT CTCCTGAAAT TTTTGGCCSU^ OGTAAATGGG 960 

AGTGATCCAA GATCCTTCCC ACTQTGGTAT GAGGAGGCTT TGAAAQATQA GGAAGAGAOA 1020 

GCCCAGGACA GAATTGCCAC CACAGATGAT ACTACTGCCA TGGCCAGTOC AAQTTCTAOC 1080 
GCTACAGGTA GCTTCTCCTA OCCTQAATAA 

Seq ID NO: 226 Protein sequence:
Protein Accession #: NP_066386

1 11 21 31 41 51

MPRAPKRQRC MPEEDLQSQS ETQGLEGAQA PLAVEEDASS STSTSSSFPS SFPSSSSSSS 60
SSCYPLIPST PEEVSADDET PNPPQSAQIA CSSPSVVASL PLDQSDEGSS SQKEESPSTL 120
QVLPDSESLP RSEIDEKVTD LVQFLLFKYQ MKEPITKAEI LESVIKNYED HFPLLFSEAS 180
ECMLLVFGID VKEVDPTGHS FVLVTSLGLT YDGMLSDVQS MPKTGILILI LSIIFIEGYC 240
TPEEVIWEAL NMMGLYDGME HLIYGEPRKL LTQDWVQENY LEYRQVPGSD PARYEFLWGP 300
RAHAEIRKMS LLKFLAKVNG SDPRSFPLWY EEALKDEEER AQDRIATTDD TTAMASASSS 360
ATGSFSYPE

Seg ID MO: 227 OHA Sequence 
nucleic Acid Accession •< NH_00S025.1 
lU Coding sequence! 82-1314 ~ 
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15 
20 
25 
30 
35 
40 
45 
50 
55 
60 
65 
70 
75 



1 
I 

GCGGAGOVCA 
GAGGCTTGAA 
AGTATGGCTA 
TATAATCGTC 
GCTCTTGCAA 
CACTCAATGG 
TCAAACATGG 
GTGCAAAATG 
GCAGCAGTAA 
TGGGTGGAGA 
GCTGCCACTT 
TTTA6GCCTG 
ATTCCAATQA 
QAAGCTGGTG 
ATGCTGGTGC 
CAGCTGGTTG 
AGGTTCACAO 
GAAATTTTCA 
TCCAAA6CAA 
GTCTCAGGAA 
CATCCATTTT 
GTCATGCATC 
TTATTT6AAT 
TAGGATTTGT 
AATATATGTA 
TGTTATGTCA 



11 
1 

GTCCGCOGAG 
ACTGTTACAA 
CAGGGQCCAC 
TTA6AQCCAC 
TGGGAATGAT 
GATATGACAG 
TAACTGCTAA 
GATTTCATGT 
ATCATGTGGA 
ATAACACAAA 
ATCTGGCCCT 
AAAATACTAG 
TGTATCAGCA 
GTATCTACCA 
TGTCXAGACA 
AAGAATGGGC 
TGGAACAOGA 
TCAAAGATGC 
TTCACAAGTC 
TGATTGCAAT 
TCTTTCTTAT 
CTGAAACAAT 
AACAABGAAA 
GTTTTACAGT 
AATTATAAGT 
TTGTGrrTTGT 



21 

1 

CACAAGCTCC 
TATGGCTTTC 
TTTCrCTGAG 
T6GIGAAGAT 
GGAACTTGGG 
CCTAAAAAAT 
AGAGAGCCAA 
CAATGAGGAG 
CTTCAGTCAA 
CAATCTGGTG 
CATTAATGCT 
AACCTTTTCT 
AGGAGAATTT 
AGTCCTAGAA 
GGAAGTTCCT 
AAACTCTGTG 
AATTGATTTA 
AAATTT6ACA 
CTTCCTAGAG 
TAGTAGGATG 
CAGAAACAGG 
GAACACAAGT 
ACAGTAACTA 
ATATCTTAAQ 
AACTTGTCAA 
GTGCTGTTGT 



31 
I 

AGCATCCCGT 
CTTGGACTCT 
GAAGCCATTG 
GAAAATATTC 
GCCCAAGGAT 
GGTGAAGAAT 
TATGTGATGA 
TTTTTGCAAA 
AATGTAGCCX3 
AAAGATTTGG 
GTCTATTTCA 
TTCACTAAAG 
TATTATGGGG 
ATACCATATG 
CTTGCTACTC 
AAGAAGCAAA 
AAAGATGTTT 
GGCCTCTCTG 
GTTAATGAAG 
GCTGTGCTGT 
AGAACTQGTA 
GGACATGATT 
A6CACATTAT 
ATAATATTTA 
GQAATGTTAT 
TTAAAATAAA 



41 
I 

CAGGGGTTGC 
TCTCTTTGCT 
CTGACTTGTC 
TCTTCTCTOC 
CTACCXAGAA 
TTTCTTTCTT 
AAATTGCCAA 
TGATGAAAAA 
TGGCCAACTA 
TATCCCCAAG 
AGGGGAACTG 
ATGATGAAAG 
AATTTAGTGA 
AAGGA6ATGA 
TGGAGCCATT 
AAGTAGAA6T 
TGAAGGCTCT 
ATAATAAGGA 
AAGGCTCAGA 
ATCCTCAAGT 
CAATTCTATT 
TGQAAOAACT 
GTTTGCAACT 
AAATAGTTCC 
CAGTATTAAG 
AGTACCTATT 



51 
I 

AGGTGTGTGG 
GGTTCTGCAA 
AGTGAATATG 
ATTGAGTATT 
AGAAATCOGC 
GAAGGAGTTT 
TTCCTTGTTT 
ATATTTTAAT 
CATCAATAAO 
GGATTTTQAT 
GAAGTCGCAG 
TGAAGTCCAA 
TGGCTCCAAT 
AATAAGCATG 
AGTCAAAGCA 
ATACCTGCCC 
TGGAATAACT 
GATTTTTCTT 
AGCTGCTGCT 
TATTGTOGAC 
CATGGGA06A 
TTAAGTTACT 
GGTATATATT 
AGATAAAAAC 
CTAATGGTCC 
GAACATGTG 



Seq ID NO: 228 Protein sequence:
Protein Accession #: NP_005016.1



1- 
I 

MAFLGLFSIiL 
ELGAQGSTQK 
NBEFZiQMMKK 
INAVYFKGNH 
VLEIFYBGDE 
IDLKDVLKAL 
SRMAVLYPQV 



11 
I 

VLQSMATGAT 
EXSK3KGYDS 
YPNAAVNHVD 
KSQFRFENTR 
ISMMLVLSRQ 
GZTEIFZKDA 
IVDHPFPPLI 



21 
I 

FPEEAZADLS 
LKNGEEFSFL 
FSC3NVAVANY 
TFSFTKZ30ES 
BVPIATLBPL 
NLTGLSDHKE 
RNRRTGTILF 



31 
I 

VNMmRIiRAT 
KEFSNMVTAK 
INKWVENNTN 
EVQIFMMyQQ 
VKAQLVBEHA 
IFLSKAIHK6 
KQRVMHFBTM 



41 

1 

GSDBHILFSP 
BSQYVMKIAN 
NLVKDLVSPR 
GEFYYGEPSD 
NSVKKQKVEV 
FLEVNEEQSS 
NTS^iDFBBL 



51 
I 

LSIALAHGMM 
SLFVQN6FHV 
DFDAATYLAL 
GSNEAGGIYQ 
YLPRFTVBQB 
AAAVSGMJAI 



Seq ID NO: 229 DNA sequence
Nucleic Acid Accession #: NM_003695
Coding sequence: 12-398



C33ACATCAGA 
CAGCCCTTAC 
TCTGCC06GC 
ATCTGGTQAA 
TCAGCAGCGG 
ACAACGCTGC 
TGAGCCTCCT 

TGATGCcrrrr 

GGGTGCCAGG 
CATGGAAT6C 
ACAQAGGATG 
GATTTCACAC 
TAAATGATTT 



11 

1 

GATGAGGACA 
CCTGCGCTGC 
CAGCTCTCGC 
GAAGGACTGT 
CACCAGCTCC 
ACCCACCCGC 
GGCOGTCATC 
CCTTCCCTTT 
AOCGCCAGGC 
TGATGACTT6 
CAGCCCCCAO 
TCCTTCTGTT 
AAACC 



21 

! 

GCATTGCTGC 
CACGTGTGCA 
TTCTGCAAGA 
G0GGAGTO6T 
ACCCAGTGCT 
ACCGCCCTCG 
TTAGCCCCCA 
CTCTGGGGAT 
TGAQGGCTTC 
GAGCAGGCCC 
CTGCATGGAA 
TTGTTGCCGT 



31 

I 

TCCTTGCAGC 
CCAGCTCCAG 
CCAGGAACAC 
GCACACCCAG 
GCCAQ6AGGA 
CCCACAGTGC 
GCCTGTGACC 
TCXaCACCTC 
CXXXMAAGTC 
CACAGACCGC 
GGTGGAG6AC 
TTATTTT6TA 



41 

I 

CCTGGCTGTG 
CAACTGCAAG 
AGTGGAGCer 
CTACACCCTG 
CCTGTGCAAT 
CCTCAGCCTG 
TTCCCCCCAG 
TCTTCGCCAG 
TCGGAGCftOQ 
ACAGAGQATO 
AQAAGCCCT6 
CTCAAATCrc 



51 

I 

GCTACAGGGC 
CATTCTGTGG 
CTGAGGGGGA 
CAAGGCCAGG 
GAGAAGCTGC 
GGGCTGGCCC 
GGAAGGCCCC 
CCGGCAACG6 
TCCAGOTGOQ 
AAOOOVCCCC 
TGGATCCCCG 
TACATGGAGA 



Seq ID NO: 230 Protein sequence:
Protein Accession #: NP_003686



1 11 21 31 41 51

| | | | | |

MRTAIiLIiIiAA LAVATGPALT LRCHVCTSSS NCKHSWCPA SSRPCKTTKT VEPLRGNLVK 
OU KDCAESCTPS YTLQGQVSSG TSSTQCCQED LCNEKLHNAA PTRTALAHSA LSLGLAItSLIi 
AVILAPSL 






Seq ID NO: 231 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 126-752 




1 11 21 31 41 51 

| | | | | |

CCGGGCAGGT GGCTCATOCT CGGQAGCGTG GTTQAGCGGC TGQ0G06GTT GTCCTGGAGC 60 

AGGGGCGCAG GAATTCTGAT ffTGAAACTAA CA6TCT0T0A OCCCTGGAAC CTCCACTCAO 120 ' 

AGAAGATGAA GGATATGGAC ATAGGBUUkAO ASTATATCAT CCOCAOTOCr OGGTATAOAA 180 

GTGTOAGQQA GAGAACCAOC ACTTCTOGQA CQCACAOASA 0CX3TGAAQAT TCCAAGTTCA 240 

GGAGAACTCG ACCGTTQGAA TGCC3VAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTOC AAACACCAGC 420 

ACGCAGTGQA CAATSCTGGG CTTTTTTCCT GTATGACTTT TTC3GTGGCTT TCTTCTCTGO 480 

CCOGTGTGGC CCACAAGAAG GGGGAGCTCT CAATG6AAGA OGTGTGGTCT CTGTCXAAQC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGA6AGACT GTGGCAAQAA GAGCTGAATG 600 

AAGTTGGGCC AGAOGCTGCT TCCCTGCGAA QGGTTGTGTG OATCrTCTGC OGCACCAGGC 660 

TCATCCTGTC CATOGTGTOC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAAATT 720 
TTCAGGAT66 CTGTATTCT6 OG6TC3U3AAT GAGAGAGTGA AGCTG6GCAG AATCICTC6C . 780 

CAAGAOTTCA GCCTTCCTTT GGAGACTGCT CCATCAGTGC CX3AGGTGTGT GGGAACAGGC 840 

TTCACTGCAC CGCCATCTTA CTGAGTTGCT TCACGTGAGG AAAAGGGGGC TTTGGCCCTO 900 

TGACTCAGTT CCACATTTTG GATTGCATAC TGGAAAAGAA GCCAATCTTC TTGCTAGTAA 960 

ACCAGCAACX: aSGCTGTATA CAGTGGTGAC CCAAGC3UIT0 GATATAAACC TAAAAATCTG 1020 

AGGOAGQGGA ^^GGTGGAAT ACA6TAGTTC TTCGAATCTG AAGTCTCCTA TTTGATCRGO 1080 

TTATTTCCTG GGACTTGGCA AAAATCTGAT TGGTGGGGAT CTCCTAGGAC CTAGT GGACA 1140 

TCTGGTATTA ATTTAATCTC AGGAAAAACA AGAAATTAAC CCAQAGAGAG TCTGGGTTTT 1200 

GGAATTCAGC GTAGCTACCT CCAGACCGTG GTGTCTGGCC TCCATTTTTG TCTGTCATTC 1260 

AGCTCTGACI TACAGCTGCA GTCACCTTTG CTATAAGGC3V CCTGGGTAGA AGGGTGGATG 1320 

QGCTTCACAT CAATTTTTTT CTTCCTTTAQ GGTGGGGGAT TGGTTTGGCT TTCTTTTGTT 1380 

GlWi ' TT ' m GTTTTATTTT TGTCAAQATT GATTTTTAGA TGCAAGGACT TGAAAAGACC 1440 

CAGAAGGATG CCACCAGTTT TTCCTTGAGG CCTAGGATTT TTTATTCTGT CCOGAGCAGA 1500 

GGTAATTCXrr CACAACTTAG TGCACXaSTA GCACCAGCCA TTTTGAGCAG AGTA CCTCT T 1560 

TGGGQAOCTT TTCGTTTTGT TTTGTTTTTA ATTCTCTTTC CTTAGCA6CA AGGTCTTTTT 1620 

TCCT!AaAGAA TCTACTCOGT TGCAOAATCA TTGCAACCTC AGGA6CCCTC ACTSATTGAG 1680 

TGCTGTCAGC CTGATATACT ACTTTGGACT CTGGAAACAG ATATGGGTTC TATTCTCTAT 1740 

TTCTACTGTG TGTCGTTAAA CAACCGTCGG AGACCAGATG ACCTGTTA6A TGGCTAGTCC 1800 

TGTATAACrC GACTCTGTAT GTTTCAATGT ATGTTACTQC AATGCTTOIC CTGCT6TAGA 1860 
GTGTTTGT6A GATGCTCTTT GAAGATG6TA CTTTTATATT T 

Seq ID NO: 232 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

NKDZDZaXBY IzPSPGYRSV RERTST6GTH RDREDSKFRR TRPLECQDAL BTAARAEOLS 60 

LDASMHSQLR IU3BEHFK6K YHEGLSALKP IRTTSKBOHP VDMAOIiFSCM TPSmiSSIiAR 120 

VAHKKGELSM EDVWSLSKHE S5DVNCRRLB RIiWQEELNEV GPDAASLBRV VWIPCRTRLI 180 
LSrVCLMITQ LAGFSGPtTFQ DGCILRSE 

Seq ID NO: 233 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

I I I I 1 I 

TTTTAATGGT GCTCATATAT ACTGTATTTT TTGTTGTTTA QTTTTACTTA TTGAQAGTGT 60 

CACAACATGA ATCACATAAT CATGATTTTT TTTTTTTACT TTTACTCCCC AAATTATTCA 120 

TOTTTCTTAG ATOGTA0TCA TTGAQAAGTC CCAATAACTC TAAACTTTTG AGTTA T AACG 180 

TAGTAAACTT CTCTTTCATC TTTGTQTTAG CTCTGTAGTC TTAACCTGGA TTTTAATTTT 240 

TTTGTTTCCA AAGTCACAAT TQAATTATTC TTAGATACCT TAAGOCACTG AATTCAGTTC 300 

TGTTTGACTG AAA6CAAAAC AAOGTGACAO TTTATTTTCA AAC ftCTA ACT TCTTGATATT 360 

TTGTTATGGT ATATCTTTTT ATTAAATATT TATTTTGACT AAGCTTTCAT AAAATATTTG 420 

AAGCXATTTT AATCATCAM TATGGAAAAC AAATTACTAT TGCATTTTCC TATATATGCA 480 

TATATTATGG ATTAACCAGA ATTGTATCAT TTTTGGCCTA ATGTCTGGAT ATAAAA6ATA 540 

ATTA6CCTAC TATAGTATTA ATAAATTTTT CAGTTGGTTT GGGCAAATTT AAACCT GAAA 600 

AATAGOTTAA AAA6TAGTTA CAAATTAAAC TTACTAATTT ATACCTQATT TTTTTTCTTG 660 

AATTAAAGTA CATTTTAAAT GAGCTTTATA ATACCTTAAA AAGTTGGTTC TAATTTAAAA 720 

TATQAAAGCT CTGGCTATCA TCCTQQQBVTA GTAATTTCTA ATTATATAGT ATTTCAAAAC 780 

TATATATTTT TTAGTTOCTT TQAQATAACT AATTTCTAA7 TATATATOTT TCAAAAACCA 840

TATCCTGTAT TTTTTTTAAG AATTGTTTTA TAAATAGGTC ATAAGATACA AGGTCTGCAT 900 

TAGAAQACCC ACTCTTACTA G6TTCCOTAA G6ATCTGCCA TAGATTTTTT TTTTTTTTTT 960 

TTTTTTTTAO GTAGTTTAAA GCAAGCACTG ATACCAGTGG GAGTTGGTCT TG ATCTAG GA 1020 

GATTCTGTTA AGCATCCAAA AACAATGGCT AATTTC3U3TT CTTAGGTTAT GOCTTQTGAC 1080

TCCA6ATAAA AGATGQAGAA TACCTCATGT ACTGTGACTT GAAAATOAAT TCTTAAAATT 1140 

CTTAGGCTCT CTCOITGTAT CTTTCTTAAG GAAAAGTTTC TGAGTGTGAT CTCTCTTTTG 1200 

CCATAGTATC AAGTGGAGGG TAGTTCAGAA AAGTTAATAG GAAATCTTTT GTGACAGCAG 1260 

ACTATAATAG AAGTTT6AGT AATATTTTAA TAAATTTATA TAATTCAAAT GATAAAAATO 1320

TATCakATGTT ATCCRATGAT TTTTATTAAA AAATTACCTT ATTATTAGAA CTGTGCCTAT 1380 

TACATAAAAA GTGCTCATGT ATTTGAATTT TAAATAATTT ATTTAAATCA AGACCACCAT 1440 

AAGTCATTAA TAATTTAATA ATOXSTTTTAA ATCAGTOGTT TTCSkACCCTC ACTTCATATT 1500 

AGAATCATCT GAGGACTTTT AATATGGAAT CX3^CCTCATA ACAATTAAGT CTAAATTTCT 1560 

GQAAGATGGA GCCATGCTTG TTTTTCCAAA AGCTCTTTGA GTGATTCTAA TTTGTAGTCA 1620 

GAGTTGAAGA CCACT6CTCT AAATTAGTGC AGGAAAATGC TTTTATTTCT CCCATGTTAA 1680

CTTTTAAAAC TAGTAATGTA CCCAGTTAAG TTTT6ATGGT TTAAATTCCA CTAAAGAACA 1740 

TATTCTTCTA ATAACTAGCA TTTATTACAT GAAATTTAAG AGTTTAAOTT CCATCAAACT 1800 

AGCCCTTGTG TAAGATTATT ATTTCTTCTC TATAACTTCA AAATASATAT TT CATTC AAA 1860 
CIGTTCAGGT GAGAAAACAT AATGGATTTT TTTTTTTTTC CTCTGGAGCT GCCTGTTCAG * 1920 

TGAGATGGAG GAGGTGGGCA CATTTAAGGT CAGTTCACTA ACCTATGGTT CAGAGTTCTG 1980 

ATCATATGGA AGTTTGGAAA AGAGAGCTTA TCACAGGTTT GTATGCTGGT GAAT GGATA G 2040 

TTTTAATTCT CACTGTCTCA AAAGAGAATC AGCTCTCCAG CAGTTCTAGA AAAGCTTTGA 2100 

CAATCXXXyvA GOGQCAGTQT TACCTTACTC CTTCACTOCr TCTTAGAAGG TAGAA TTAAG 2160 

TTTCTGQAAT TGCACCTACA TGTTTTCTTA TTAACATTCA GAATTGG6AA TATTAATTTT 2220 



TOCAGTGAOT 
CAATTTTGTG 
AAAACCTCAT 
TTTAAAGATG 
CTTCAGAAAT 
AGATG6TATT 
TTG6AA6C0G 
ATTTTATATT 
CTTCTTAAAA 
ATAGAGATTC 
AAGACATTAA 
CX3U3VTTAAA 
TGT6CCT0GT 
CCTTCATCAA 
ACAACATCTG 
CCCCAAACAC 
CACACAACCA 
GACAACACAT 



AGTTTTCTGA 
TTTGTTTACT 
GCCTTTTCAT 
CTTTAATGAA 
CCATATATTT 
TAAAAT6AAT 
CCA6CCATTC 
ACTAT66TAT 
CCATAACXTTG 
TTCTTTTATG 
CATAAGTCTC 
CAACCACX3GC 
ATOGCCTCTG 
GCACTTGCCA 
CAACTCTACC 
AAAACCACTA 
ACACACCACXS 
CACATACACT 



AATTGGTAAC 
TTTAT6TAAA 
TACATCTAAT 
AAGTATTAAG 
GTCATATTTA 
GCCCAAAAAT 
ATGTAjQAGAG 
CTGTtSTACCA 
GCTTGCCTTT 
AAGAAGAGCT 
TGAGCA6TGA 
AACACTCAGA 
6CATAACTTA 
ACACATTCAC 
CTATCAACTQ 
AATCATAACX: 
ACCAAACACC 
CACTACCCCC 



TTG6AGA6TA 
AATTTOATAT 
TTGAACTCTC 
AAAATATATA 
TTTTTTTAGA 
ATCTTGTACC 
TTTATAAGAA 
TATTTCTAAG 
TAGTGTTAAA 
GACGTAATTT 
TACATTTTGA 
CTTGGCACTT 
C3VCGAAT0QT 
CTCTAACTTG 
CCAACCTAAA 
ACCACACACG 
GCACCACAAA 
CCATACTCCC 



AAATAACX3TA 
6T(SAATTACA 
AACTTCAGTG 
GATTTGTATG 
AACCTCCTAA 
TTT6TCCAAA 
AATAATTTAA 
TATTCATTAT 
CACAAAATCC 
ATTACCAGTG 
AACATQAAGA, 
TCCTAGQAAT 
CCTCCCTACT 
TACAACCTTA 
GACCXXTCAAC 
CCACACACCA 
CAAGCTAACA 
ACOCACCA 



TTTTQCTTTT 
CAGTTCTAAT 
CCAGAAGTGC 
TCAGTTTATA 
TTGGATAACT 
A6TTTATCTG 
AATTGTATGC 

TAAATTGGTA
AACATTGTAT
CATCTGCACA
GTGACAACCA
gcatcx:tata 

T6TCTAG6CT 
CCAACTCACC 
ACAACACAAC 
CACACCCACC 
AOCACAAACA . 



Seq ID NO: 234 DNA sequence

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 27-281 



AGCAGGAGGA 
GTCTACGGG6 
TTCCTGCCXC 
GGCGCTCACT 
CAGGGTTCCG 
GAGTTTTTCT 
CAGAAAGAAT 
ATGTTGCAGG 
TTTTCTTCTC 
ATAAAAACIG 
GATGCCAOGA 
TCAAGCCAAG 
CTCAAACCCG 
GGTCAGAACA 
A6ACAGCCTG 
TCrCAGTATC 
CATAAACACA 
GGAAAGGTCT 
TCTTTTGATG 
GCAAAGAAAA 
TTTATTTTTA 
TTTGACCTTG 
GATTAAAACA 
AAATCTCAAG 
AAAAAAAAGC 
GCACTTCTCT 
TCCCATCAAA 
ATAATOCC 



11 

1 

GAGCTGGCGG 
TA6CAGTTAC 
GTGCTCATTT 
GAAACAGTGT 
ACCAATCCAA 
TTGCTCTGAT 
GCAAGGAGAT 
AAGAGCTAGT 
CACTTGGCAT 
TTCAaOGGTT 
GGAAAGATGC 
CCAACAGGTG 
GGGAAGCXrCA 
CA6CTAAGCA 
TGACX3TTTCA 
TTACGCCCAG 
TAACAGCA6C 
CCTGTGACTG 
ACTCTATATC 
ATAAAAGACA 
AACTGAATTC 
AAA7AATCTT 
TATTAGTAAT 
GCTTTTAAAG 
CCTCCATCTG 
TCTCATTTTC 
GCCAAAOAAA 



21 

1 

GAAGACATQC 
ATCAGACTGA 
GGGGCTGACG 
6TTGCTGCAC 
GAGCCTTGCA 
CTTGQA6ACA 
AGACCAACGT 
CTTTCAGGCT 
ATCAAGAGCC 
CGCCAACAAG 
CAGGGGTAAA 
TTCTGTTTTT 
CTCTAGAACC 
GATGGCTTGG 
AAA8CAAAAG 
TGACACGATC 
AGCAATAATT 
TTTTATTTTT 
CAACTCTGAG 
ATTTCGAOTA 
AGCAGAGATT 
TACATTGTAA 
TAATTATTAA
CATTTGTACA
ATTCTCATTT 
CACTGTCTG6 
GAAAAGAAAA 



31 
1 

ACCCCTTGAA 
GACACTTCCT 
CCATTTTAGG 
ACCGCCTTGT 
GAAAGCATTA 
TCX:CTCTGCC 
GAGATTCTCC 
GGGCTGGTGA 
A6GCGT6GAA 
AA6TGGTAAA 
GTGGGAAAAT 
CATCACAGAA 
CATGCTGGTC 
GTCATCAGGA 
TCCOCTACCA 
TAOCCTC!AAA 
AAAGATGAGA 
AGGGAAACAG 
6TTT6ATTAA 
AGTATGCCAG 
TACATGCATT 
ATTCTTAATG 
AGQAGAATAA 
AATGACTGGA 
TCATXGTCAG 
CAAQC TAGA A 
TTGTTCTGTA 



41 

I 

GACCCAGAGA 
GTTTACAGGA 
CCTCA6CCCA 
TTTGCTTGTT 
ACGTGCTTTT 
TAGTGGAAAC 
TTCATGCACT 
CCTGAGAAAG 
6ACTAAAACA 
GTAGCAAAAA 
GGGAACCTGA 
CTAATAAGTG 
ATCCATATCC 
CX5TCCATTAC 
GCCAGTQAAG 
ACTTAAAAAA 
TGAGAACAAT 
AGAGGAAGAA 
A6AAATQACC 
TTCX3AATTAA 
ACGAT6ATTA 
ATCAAAACAA 
TTGCAAATAC 
CATTTTTTAA 
T6CAACAACA 
ATTCTCACX3A 
CA8ATATAT8 



Seq ID NO: 235 Protein sequence:
Protein Accession #: Eos sequence 

1 11 21 31 41 51

| | | | | |

MHPIiKTQREA VCLPRSSYIR LRHFLFTGDY KIPAFCSFGA I1AIL6LSPSA PRRSLKQCVA 
PKRLVLLVGA LS6FRPIQEP CRKH 

Seq ID NO: 236 DNA sequence
Nucleic Acid Accession #: NM_002075
Coding sequence: 406..1428



CCACAATAGG 
ACAGGATCAG 
A6TCCTTTCT 
ACCACCCTGA 
6GCCA66CCA 
CGTCGCAGCT 
CCAGCCAGAG 
CAACTGCGTC 
GCTGACGTTA 
CGGAOGCGGC 
GATTCTAAGC 
ACCACCAACA 
GCCCCATCAG 
CTCAAATCCC 
CTCTCCTGCT 
TGTGCCTTGT 
GACTGCATGA 
GCCAQTGCCA 



11 

1 , 

GGCAGACCTG 
ACCCA6AGGC 
AATCTCAGCT 
GCTGAOGAGC 
GGCCAGCTCC 
GAGGGAGTAA 
CCCAAGAGCC 
AGGAAGCGGA 
CTCTGGCAGA 
GGACGTTAA6 
T6CTG6TAAG 
AGGTGCACGC 
GQAACTTTGT 
GTGAGGGCAA 
GCC6CTTCCT 
GG6ACATT6A 
OCCTGGCTGT 
A0CTCTG6GA 



21 

i 

TCCATCCTTC 
AGCTGGTTGG 
CCTGCCTGTA 
ACAGTTTGAG 
TCTGGCAGCA 
GGAGGCTCCC 
AGAGTGACCC 
GCAGCTCAAG 
GCTG6TGTCT 
GGGACACCTG 
TGCCTCGCAA 
CATCCCACTG 
GGCATGTGGG 
TGTCAAGGTC 
GGATGACAAC 
GACTGGGCAG 
GTCtCCTGAC 
TGTGGGAOAO 



31 
I 

TCTGTGGGTC 
GGTTTGTOGA 
CCCTCCCATA 
GCCCCCCCAA 
GA6CCT66GC 
AGGAAGCGGA 
CTCGACCTGT 
AAGCA6ATTG 
G6CCTAGAGG 
GCCAAGATTT 
GATGGGAAGC 
CGCTCCTCCT 
GGGCTGGACA 
AGCCGGGAGC 
AATATTGTGA 
CAGAAGACT6 
TTCAATCTCT 
GGOACCTGCC 



41 

1 

CCCTGTACCT 
GAAGAAGGAT 
CTCACCAAAC 
OCCCCCGCCG 
AGGX6ACGGG 
GCTGGAAACC 
CAGCCATGG6 
CAGATGCCAG 
TGGTGGGACG 
ACGCCATGCA 
TGATCGTGTG 
G6GTCATGAC 
ACATGTQTTC 
TTTCTGCTCA 
CCAGCTCGGG 
TATTTGTGGG 
TCATTTC666 
6TCAGACTTT 






51 

1 

GAGGCCGTCT 
GACTATAAAA 
TCTGCACCCA 
GGOGCGCTCT 
CTCTTTGGCA 
ATAAGGAATA 
CAA6AGAAAG 
AATGTCCAGC 
GGAAATGTTT 
TGGGGATGSA 
AGCCA6GAGG 
GTGCTGAGGA 
CCAAGGCCCT 
ATCCAAAOGA 
CIACCTGATT 
AAAAGQGAAA 
TAAGAAAAAA 
GAATGATTTT 
TTGAACCACA 
TGATTTACTT 
ACATCTGAAA 
GGTTCTCAGT 
AACATTCCTA 
ATTTGAAAAA 
AAAAAGGTAT 
CTACCTTTSA 
ACATTAAAAA 






51 

1 

TTCTCCCCCA 
TATCCAGATC 
CCTCTTCCCC 
GTC6GGGCCA 
CGGGC6GGG6 
CGGCCGAGGT 
GGAGATGGAG 
GAAAGCCTGT 
AGTCCAGATG 
CTGGGCCACT 
GGACAGCTAC 
CTGTGCCTAT 
CATCTACAAC 
CACAGGTTAT 
GGACACCACG 
ACACAGGGGT 
GGCCTGT6AT 
CACTGGCCAC 






TCAACGCCAT 
CCTGCCXSCTT 
TCATCTGOGG 
AOGACSGACTT 
CTGGCCACGA 
CAGGTTCCTG 
GAAGGCAGTG 
TCTATATTCC 
aACTGTGCCT 
CCATGGCCTT 
CCCCTCCCCA 
TCCTCCCCCIi 
GACTATGGCT 
TCTACCTTTT 



GAGTC6(3ACA 
GATGACGCTT 
CACGAGAGCA 
TTOGCTGGCT 
GGCATCCTCT 
GCTGTGGCCA 
AAGGGAAGTG 
GGTGTTCTCT 
G66AGCATG0 
CCATCTCCTC 
GOACAACCTG 
QCCCTAGGAT 
TTGGCCCTGT 
TTCTCCTTTT 
GT 



Seq ID NO: 237 Protein sequence:
Protein Accession #: NP_002066



CTGTTTCTTC 
GTTTGACCTG 
CATCACGTCC 
CAACTGCAAT 
TAACAGGGTG 
GGACA6CTTC 
AACACACTCA 
GGGTGCCATT 
TTGGGA6GCA 
CCCTCCCCAC 
GCCCTTT6CA 
GAQCCACTAC 
CTQGCACCAC 
TTTCTCTCCT 



CCCAATG6AG 
CGGGCA6ACC 
6TG6CCTTCT 
GTCTGGGACr 
AGCTGCCTGG 
CTCAAAATCT 
OCAGCCCCCT 
CCCACTAAGC 
QCATCAGGGA 
AGTCCTCACA 
6GCCCAGCA6 
CTTTGTCCAG 
TAG6GTCCT0 
AAGACACCTO 



AGGCCATCTG 
AGGAGCTOAT 
CCCTCAGTGG 
CCATGAAGTC 
GAGTCACAGC 
G6AACTGA6G 
GCCOQAOCCC 
TTTCTCCTTT 
CACAGGGQCA 
GCCTCTCCCT 
ACTTGAGTCT 
GCCTG66T6G 
OCOCTCTTCT 
CAATAAAGTG 



CAGGGGCTOG 
CTGCTTCTCC 
CCGCCTACTA 
TGAGCGTGTO 
TGACGGGATG 
AGGCTGGAGA 
ATCTCATTCA 
GAGG6CAGTG 
AAGAACTGCC 
TAATGAGCAA 
GAG6CCCCAG 
TATAGGGCGT 
TATTCAT6CT 
TA6CACGCTG 



1 11 

I I 
KQEMBQLRQE AEQIiKKQIAD 
MBWATDSKLL VSASQDGKLI 
CSIYMXiKSRE GNVKVSRELS 
VGHTGDCMSL AVSPDFNIjPI 
ICTGSDDASC RZfFDLRADQB 
KSERVGZLSG HDNRVSCLGV 



21 31 41 51 

I I I I 

ARKACACVTL ABLVSGXiBW GRVQMRTRRT LRGHLAKIYA 
VHDSYTTNKV HAIPLRSSWV MTCAYAPSON FVAOGGLDNM 
AHTGYLSCCR PLDDNNIVTS SGDTTCALWD lETGQQKTVP 
SGACDASAKL WDVRB6TCRQ TPT6HESDDI AICFFPN6BA 
LICPSHESII OGJTSVAPSIj SGRLLFAGTO DPHOWWDSM 
.TADOAVATG 8WDSFLKIHN 



Seq ID NO: 238 DNA sequence

Nucleic Acid Accession #: CAT cluster



1 

I 

TCCCAATGTO 
TACCATTTGC 
ACTCATTCTG 
TGCATTGACC 
TAAGTQ^CTG 
TTTTQACGAG 
CAAAAAGGGG 
AAGAAAGAAA 
GGATCAAGAA 
CAGAAGTGAA 
TAGAAAAGTT 
CAACTACTCA 

crroxGTTCC 

ACTGTTOTTT 
GACATQAAAG 
TAGGCTAAGT 
AAAATCOCAA 



11 

1 

TNGAACCTAC 
TTTTAAGGCA 
TGAATTATGA 
A6TGTGAAGC 
GAAAGCTGAA 
TATCGGGTGA 
AAAAAAAA6A 
AATAAAATAC 
GTTTGTGTAC 
TCAAAATATT 
TTTCTGTAAA 
ACTTTCCTAC 
AATAAAGCTT 
OCCAAGTCXn' 
TTCATTGGGT 
TATAATACAC 
ATAAAA 



21 

CATAAATTCT 
GATAATCCTC 
AATCTGAAAA 
ACAGTGOAAT 
GAATCACOGO 
CTTTGAGGTG 
GCAACCAAAG 
ACAATATGGA 
ACATAATCTC 
TCAAAATGCT 
AGTCAGATAG 
TGTAGCACAA 
CATTTACAAA 
AATATAGTTG 
TGCTAAAAAG 
TGTTTTAACA 



31 
I 

TTTCTTAOJG 
CAA6TTTTCT 
GQAATTGGAA 
GAGAATGC6T 
CTTCAGTGAC 
GTCAAGAAAC 
AAAAAAAATC 
OGATGGAGAA 
ATTTTGAOAT 
GTCTTATGAA 
TAAATATTTT 
GAGTAGCTGT 
AACATGCCAT 
CTTAGCAAGT 
TATGTAGAAA 
ATTGTAAAAT 



41 

I 

GACAATCTTA 
AATGATATCT 
GTTGCTAAAA 
GCCCTGACAC 
ATG6AACCCA 
CACACTTTAA 
CATAAAATTG 
AAACAGTTAC 
ATATAACTAT 
ACTACAATAT 
AGGTTTTGCA 
QGTACTGTGC 
GGGCCATATT 
ATTGTOAGCT 
TTCAAAGGAA 
GTAAGAGAAA 



Seq ID NO: 239 DNA sequence

Nucleic Acid Accession #: NM_001786.1

Coding sequence: 130-1023



1 

1 

GGGG66GGGG 
QTTGTTGTAG 
TQACTAACTA 
6TGTATAAGG 
6AAA6TGAAG 
CTTCSTCATC 
CTCATCTTT6 
CAGTACATG6 
TTTTGTCACT 
GACAAAGGAA 
AGAGTATATA 
TCAGCTOGTT 
GCAACTAAGA 
AGA6CTTTG6 
AAQAATACAT 
GAAAATGGCT 
GGCAAAATGG 
TAGCTTTCT6 
AACTCTTGTC 
6TCTTCTAAT 
ATTCTGTAAA 



11 

I 

GGCACTTGGC 
CTGCCGCTGC 
TGOAAGATTA 
GTAGACACAA 
AG6AAGGGGT 
CAAATATA6T 
AGTTTCTTTC 
ATTCTTCACT 
CTAGAAGAGT 
CAATTAAACT 
CACATGAQGT 
ACTCAACTCC 
AACCACTTTT 
GCACTCCCAA 
TTCCCAAATG 
TGQATTTGCT 
CACTGAATCA 
ACAAAAAGTT 
TATTTTTGTC 
TTCAAAAATA 
TGTGAAAAAA 



21 
I 

TTCAAAQCTG 
GG006CGGGO 
TAOCAAAATA 
AACTACAGGT 
TCCTAGTACT. 
CAGTCTTCAG 
CATGGATCTG 
TGTTAAGAGT 
TCTTOICAGA 
GGCTGATTTT 
AGTAACACTC 
AGTTGACATT 
CCATGGGGAT 
TAATGAAGTG 
GAAACCAQQA 
CTCOAAAATG 
TCCATATTTT 
TCCATAT6TT 
TTATATATAT 
TAACTTAAAA 
AAAAAAAAAA 



31 

I 

GCTCTTGGAA 
GAATAATAA6 
GAGAAAATTG 
CAAGTGGTAG 
GCAATTOGGG 
GATGTGCTTA 
AAGAAATACT 
TATTTATACC 
GACTTAAAAC 
GGCCTTQCCA 
TGGTACAOAT 
TGGAGTATAG 
TCAGAAATT6 
TG6CCA6AAG 
AQCCTAGCAT 
TTAATCTATG 
AATGATTTGG 
ATGTCAACAG 
TTCTTTGTTA 
ATGTAAATAT 
AAAAA 



41 . 

I 

ATTGAGCGQA 
CCGGGATCTA 
6A6AAG8TAC 
CCAT6AAAAA 
AAATTTCTCT 
TGCAGGATTC 
TGGATTCTAT 
AAATCCTACA 
CTCAAAATCT 
GAGCTTTTGQ 
CTCCAGAAGT 
GCACCATATT 
ATCAACTCTT 
TGGAATCTTT 
CCCATGTCAA 
ATCCAOCCAA 
ACAATCAGAT 
ATAGTTGTGT 
TCAAACTTCA 
TCTATATGAA 






51 

1 

TNCTAANCAA 
GAAACTATTA 
ATCTATCATT 
CAAA6AAAAA 
GTGATTTGAT 
GAACAATGTC 
CACAGAAQAA 
ATTTCTTTAT 
r m HTfCTTT 
TCTCACAGAT 
CTGTCTTTTG 
AAATAAATT6 
TG6CCTOTAC 
ATTTGAGGAA 
AATTAAAATT 
TTTACAAATA 



51 
I 

GAGCGAGGCG 
CCATACCCAT 
CTATGGAGTT 
AATCAGACTA 
ATTAAAGGAA 
CAGGTTATAT 
CCCTCCTGGT 
GGGGATTGTG 
CTTGATTGAT 
AATACCTATC 
ATTGCTGGGG 
TGCTGAACTA 
CA6GATTTTC 
ACAGGACTAT 
AAACnQQAT 
ACGAATTTCT 
TAAGAAGATG 
TTTTATTGTT 
GCTGTACTTC 
TTTAAATATA 



Seq ID NO: 240 Protein sequence: 
Protein Accession #: NP_001777.1






85 1 



31 



51 
I 



MEDYTKIBKI GBGTYGWYK GRHKTTGQW .AMKKIRLESE EEGVPSTAIR BISLLKELRH 
PNIVSLODVL MQDSRLYLIP EPLSMDLKKY LDSIPPGQYM DSSLVKSYLY QILQGIVFCH 



60 
120 




SRRVLHRDLK PQNLLXDDKO TIKLADFGIA RAFGIPIRVY THEWTZiHYR SPEVLLGSAR 
YSTSVDIHSI GTZFAEUVnC KPIiFBGDSEI DQLFRZFSAL GTPNNBVHPE VESLQDYKNT 
FPKWKFGSIA SRVKNLDENG LDLLSKWilY DFAKRIGGKN ALNHPyFNDL 13HQIKXM 



Seq ID NO: 241 DNA sequence

Nucleic Acid Accession #: NM_033379.1

Coding sequence: 132-854 






180 
240 



1 
I 

GGCG0GO6OQ 
GCTTTGCAGA 
ATTGACTAAC 
TTGTGTATAA 
TAGAAAOTGA 
AACTTGOTCA 
AICTCATCXT 
GTCAGTACAT 
TATTGCIGGG 
TTGCTGAACT 
TCAGGATTTT 
TACAGGACTA 
AAAACTTG6A 
AACGAATTTC 
TTAAGAAGAT 
TTTTTATTGT 
AGCTGTACTT 
ATTTAAATAT 



11 

I 

OQGGCTCAAC 
GAG08CCCTC 
TATGGAAGAT 
GGGTAGACAC 
AGAOGAAGGG 
TCCAAATATA 
TGAGTTTCTT 
G6ATTCTTCA 
GTCAGCTCGT 
AGCAACTAAG 
CAGAGCTTTG 
TAAGAATACA 
TGAAAAT6GC 
TG6CAAAATG 
GTAGCTTTCT 
TAACTCTTGT 
CGTCTTCTAA 
AATTCTGTAA 



21 

■1 

TTTGTAQAGC 
CAGGGACTAT 
TATACCAAAA 
AAAACTACAG 
GTTCCTA6TA 
GTCAGTCTTC 
TCCATGGATC 
CTTGTTAAGG 
TACTCAACTC 
AAACCACTTT 
GGCACTCCCA 
TTTCCCAAAT 
TTGGATTTGC 
GCACTGAATC 
GACAAAAAGT 
CTATTTTTGT 
TTTCAAAAAT 
ATGTGAAAAA 



31 
I 

QAGGGGCCAA 
60GTG0GGGG 
TAGAGAAAAT 
GTCAAGTGGT 
CT6CAATT0G 
AGGATGT6CT 
TGAAGAAATA 
TAGTAACACT 
CAGTTGACAT 
TCCATGGGGA 
ATAATGAAGT 
GGAAACCAGG 
TCTCGAAAAT 
ATCCATATTT 
TTCCATATGT 
CTTATATATA 
ATAACTTAAA 
AAAAAAAAAA 



41 
I 

CTTGGCAGAG 
ACACGGGATC 
TGGAGAAGGT 
AGCCATGAAA 
GGAAATTTCT 
TATGCAGGAT 
CTTCGATTCT 
CTGGTACAGA 
TTGGAGTATA 
TTCAGAAATT 
STGGCCAGAA 
AA0CCTA6CA 
GTTAATCTAT 
TAATGATTTG 
TATGTCAACA 
TTTCTTTGTT 
AATGTAAATA 
AAAAAA 



TACCCRTACC 
ACCTATGGA6 
AAAATCAGAC 
CTATTAAAGG 
TGCAGGTTAT 
ATOCCTCCTG 
TCTCCAGAAG 
GGCACCATAT 
GATCAACTCT 
GTQGAATCTT 
TCCCATGTCA 
GATCCAGCCA 
GACAATCAGA 
GATAGTTGTG 
ATCAAACTTC 
TTCTATATGA 



Seq ID NO: 242 Protein sequence: 
Protein Accession #: NP_203698.1



11 



21 



41 



51 



6AGCAACCTC 
CGACCCAGAG 
GCGGGGCCCA 
ACCTGCCACC 
6CTGTTGGGC 
GCCCCAGTGG 
CQAGGGGCTG 
TGACTCCTTG 
CATCCTCCTG 
CTTGGAAGAC 
TCTTGCAGGT 
ATTCTATGAC 
T6GCTGGGCT 
COGAAAAACA 
GAAAGACTAC 
GGACATTGAG 
QTATG6TATT 
AAACATGGCT 
TTGTATTACT 
TATATATAGA 
CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 
TTATTTTTTA 
TTTCATXGGT 
ASCCAAGAA6 
GT6ATAAATT 
TTTGCTTTGA 
CACAACTTTA 
ACCTTTTTGT 
TATATCTTCC 
GATAATCTGG 
TCTTTTTTCT 
AATATTAATT 
TTTATTTGCT 
CTTCATGTGA 
ACACATACCT 
AAACCTAGGC 
ATTCTTTCAG 
TTTCCAGTCT 



11 

1 

AGCTTCTAGT 
CTTCTCCAGC 
GCCACCTTCG 
CCTGAGCCA6 
TTCATTCTOO 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 
GGAGTGATAG 
GATGAGGTGC 
CTGGCTATTT 
CCTATGACCC 
GCTGCTTCTC 
ACCTCTTACC 
GTGTGACACA 
ATACTATCAT 
ACAAAACAAA 
TAATCTTATT 
GCTTCOCATT 
TATGTATATA 
TGATACTAGC 
GAAGATGTTT 
TCATTTACTC 
AAGGATGAAT 
CCATAATCTT 
CTCTATCTCC 
AATTTATTAC 
CCTGTT6ACC 
AAATATTTGT 
TTGATTGAAT 
TCCCCATTCC 
TAATAAG6TG 
TGACAAATAT 
ATCT6CCAAA 
AGTTTATATT 
CAGCTGGCTG 
TTCACTGCCT 
TCATGTGGTT 
ACATACCTTC 
CTGTGTCTGA 
GTACAGAATG 



21 

I 

ATCCAGACTC 
GGCGGCGCA6 
GGAGTCGGGG 
CX306GGOGCC 
OCTTCCTGGG 
CCTATGCOGG 
GCX3TGTCGCA 
GCAGCACATT 
CAATCTTTGT 
AGAAGATGAG 
TAGTT6CCAC 
CAGTCAATGC 
TCTGCCTTCT 
CAACACCAAG 
GAGGCAAAAG 
TAACATTAGG 
CAAACAAACA 
TTATCTTCTT 
GAGTAATCAT 
TACATGTTTT 
ATACTTAAAA 
ATTOGTATAT 
TTCTTCATTA 
TCTTTCAATT 
ATAGCACTTG 
TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTQAGT 
TTTTAAGCTA 
TTAATTGTAT 
TQGTCTGTTT 
TCTCTCTGTA 
TTGAGATAAT 
ACTCTCATTC 
AGACACTGAA 
TCCTCTCTCT 
CAGTGCCTTC 
ATGTGGCTCA 
CATGTTTGTG 
CTATTTCACT 



31 
I 

CAGCGCCGCC 
C6AGCA66GC 
TTGCCCACCT 
OQAGCGAGTC 
ATOGATOGGC 
CGACAACATC 
6A6CACCGGG 
GCAAGCAACC 
GGCCACCGTT 
6ATGGCTGTC 
AGCATGGTAT 
CAGGTACGAA 
GGGAGGTGCC 
GCCCTATCCA 
GAGAAAATCA 
ACCTTAGAAT 
AAAAACC3CAT 
TCCTCAATAT 
ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 
TTTCTTTTTC 
6CTTTGGGT6 
CTTCATGC6T 
CATC6TTATT 
ACATTTCATA 
TTTGOAG6CA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTGAACAA 
GCTGTAAGCA 
GATACTTAAC 
TTTGAACATG 
GAAGTCACTG 
ACCAGTCTAT 
CTCTCTCTAC 
GT6CCTTCCT 
CTCTGTTCCA 
TGAGCAAGAT 



41 

I . 

COGGGOGCGG 
TCCCCGCCTT 
GCAAACTCTC 
AT06GCAA08 
GCCATC6TCA 
GTGACC3GCCC 
CAGATCCAGT 
CGTGCCTTGA 
GGCATGAAGT 
ATTGGGGGTG 
GGCAATA6AA 
TTTGGTCAGQ 
CTACTTT6CT 
AAACCTGCAC 
TGTTGAAACA 
TTTGGGTATT 
OTGTTAAAAT 
AG6AG0GAAG 
GGGAAGGGGT 
ATA6ACAGTA 
ATAGGTAAAT 
GTCCTTATAT 
CCTTTGCCAC 
GCCCTTTTCA 
AAGOCCTTAT 
GCCTACATTT 
AATCTTTCIQ 
TCTGACCCAT 
TGTTCCOCCA 
GTTTTATATC 
AGTGTAATTA 
AGTGCTAGAC 
AGTCACTTAA 
CAGTTAGAAG 
AACTATGCCT 
AACAAAACCT 
TTCCACTGAA 
CAGTCTATTT 
CTCTCTACCA 
TTTTAACAAC 
GATGTATGGA 



51 
I 

ACCCXaACCC 
AACTTCCTCC 
GGCCTTCTGC 
GGGGGCTGCA 
GCACTGCCCT 
AGGCCATGTA 
GCAAAQTCTT 
TGGTGGTTGG 
GTATGAAGTG 
GGATATTTCT 
TCX3TTCAAQA 
CTCTCTTCAC 
GTTCCTGTCC 
CTTCCAG06G 
AACOGAAAAT 
GTAATC7GAA 
ACrCAGTGCT 
ATTTTACCAT 
GCTCCTTAAA 
AAATACTATT 
GTATTTAATT 
ACATATQtrAA 
AAGACCTAGC 
TATACTTATT 
TTGTTrPGTG 
TAGTTTCTAA 
CATOACCAAA 
AGCACTCTTG 
GGTGTTGTAA 
CCX:CTAAACT 
TCATGOGTTT 
TTTCTGQAGT 
TCTTTCTACC 
AG6TAGTGTG 
ATGTAGTGTC 
ACACAC6TAC 
CAAAACCTAC 
CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGTGTTG 






31 

1 I I I 11 

MEDYTKIBKI GEGTYGWyK GRHKTTGQW AMKRIRLBSE EBOVPSTAIR BISLLKELRH 
PHIVSLODVL MQDSRLYLIP EPLSMDLKKY IiDSIPPGQYM DSSLVKWTL WYRSPEVLLG 
SARYSTPVDI WSIGTIFAEL ATKKPLPHGD SEIDQLPRIP RALGTPNNEV WPEVBSLQDY 
KNTFPKNKPG SLASHVKNLD ENGLDXiLSKH LIYDPAKRZS GKHALNHFYP NDIiQKQIKXM 

Seq ID NO: 243 DNA sequence

Nucleic Acid Accession #: AF101051.1

Coding sequence: 221-856




GCACTGGTGT CTGGAGACCT QOATTTGAOT 
ASCAAGGCAT TTGGCTGCTG TAAGCTTATT 
CTQATCTTCC CACCTCACAG TGATGTTOTG 
GTGOrrTTGT AATTTQAAAA GTGCTATACT 
OGTTTTGGTG TTQCTTTTCA AATGTTTGAA 
GCCTTAACCA GTCTCTCAAG TGATGAGACA 
AAGATTCTQA GSIUU3TCTTA TCTTCT6CAG 
ACAGATGTAA TGGOAASM TAAAAGCCTA 
TTTTGAATCA TAATAACTCA TAAGGTGCTA
TGTTAGCTGG CAGCTGAOGC TGCTAGGATA 
CTACACAAGG AAAGTCAGCC ACOGTGTCTT 
TQCCTTCCAA ACCT6AGAAT ATAT6CTTTT 
ATACATAGAT CTTCATGATC TGTOAGTGTA 
ACAAAAAAAT TTTATGGCCC AAAATGACCA 
TTTGATCTTT TTATATTCTT CTACCACACC 
TTATAATGG6 AATTT6TATA AAGCATTACT 
AAAA6QAAAA AAAAAAAAAA AAA 



Seq ID NO: 244 Protein sequence:
Protein Accession #: AAD16433.1 






CTTGGTGCTA 
GCTTCATCTG 
GGGATCCAGT 
AAGGGAAA6A 
AATAAAAAAA 
GTGAAGTAAA 
TGAGTATGGC 
CGTGTTG6TA 
TCTGTTCAGT 
GTTAGTTTGG 
ATGAGGAATT 
G8AAGITAAA 
ATTCGATGT6 
ACGAAATT6T 
TGGAAACAGA 
CTTTTTCAAT 



TCAATCACCG 
TAAGOGGTGO 
GAGATAGAAT 
ATTGAGGAAT 
TGTTAAGAAA 
ATTGAGT6CA 
CCAATGCTTT 
AATCCAACAG 
GATGCXXrrCA 
AAATGOTACT 
GGACCTAATA 
ATTTAAATGG 
GATATCAGTT 
TACAATAGAA 
CCAATAGACA 
AAATT6TTTT 



TCTGTGTTTG 
TTTGTAATTC 
ACATGTAAGT 
TAACTGCATA 




1 11 21 31 41 51 

I I I I I I 

MAKAGIiQIiLG FILAFLGHIG AIVSTALPQW RIYSYAGDKI VTAQAHYEGL WMSCVSQSTG 
QZQCKVFDSIi LHL88TLQAT RALMWGILL GVIAIFVATV GMKCMKOiED DEVQKMRKAV 
ZGOAZFLLAO LAZLVATAWY GNRIVQEPYD PMTPVHARyS FGOALFTGHA AASXiCLLGQA 
LLCC8CPRXT TSYPTPRPYP KPAPSSGXDY V 



Seq ID NO: 245 DNA sequence

Nucleic Acid Accession #: CAT cluster



TTTTTTTTTT 
TTAATGGTTA 
AGCATGGTCC 
TTTTCTTCCT 
CAQTTGTTAT 
AGGTGGGGAO 
TTAATAGCXa 
GTCCTACGCC 



11 

1 

TTTTTTTTTT 
AATGCTGTTT 
CGAGAGTCTG 
6A6ATTTAGT 
GAAGAAT6CA 
GTOQCTCAAO 
CT6CACTTCA 
CAOGGAGTCT 



21 

J 

TTTTTCAAGG 
ACC3UU3TGAC 
ACAAACCTCA 
TTCTTCATCQ 
TATATTAGAA 
CCCAGGAATT 
6CCTGGGCAA 
CGCT6ATTGC 



31 



A6AGCACAA6 
CCAGAGGCA6 
GTTCAAATCC 
TTAACAATGA 
TGCCTGTAGT 
CAAAGCTGCA 
TOTAGXAAQA 
TA6CACAGCA 



41 

I 

6AACTTTATT 

OGTGGrrrAG 

TTCTTTTGTC 
GQATATTAAT 
CTCAGCTACT 
ATGCATTATG 
TCOCATCTCT 
GTCTQAGATC 



Seq ID NO: 246 DNA sequence

Nucleic Acid Accession #: XM_058553.

Coding sequence: 897-1400



1 

1 

AATTTTCA6A 
TAAATGTATT 
GTGAAACCAT 
CTGTTATCCA 
ATAGAGGGAA 
TGGGATGAGA 
GATCATQTTT 
GTTGAGT6TA 
ACTGGTAACC 
TTATTTCTGT 
TTGTOCAGGC 
GCCTTTGOCT 
TTTTTTTGTT 
TAGTCTTGCT 
CAGCCTCCCA 
AAGAAACTTA 
ACXATCAAAT 
CTGATQTTGC 
CTGAAATTAG 
TCAACXaAAC 
CTTGOGATGA 
GCACAACTCA 
ATAACCTGGC 
ACAATGGAAA 
GTTGCTTCTT 
AACTOCCTQT 
TTTAAT6CAA 



11 

I 

AGTTTCGTAT 
TAGTCTCAGT 
TTCTCTTTTA 
TAATATGQAC 
TGAGTATTAA 
GGA6GTQAAA 
AAGAAAAGTC 
TACTGTCTGT 
TGCCTATCTG 
6TTTATGTAT 
CAAGTGCAAT 
CCTGAGTAGC 
TGTTTGTTTG 
TTGTTGCCAG 
GAGTGCTAG6 
CAOOS ACTC C 
CAGGGCTTGC 
AAGCAAATTG 
TCATCATATC 
CAGGAGCCTT 
AOACTGGGAT 
CTACTCTGAC 
TTCAGGCATG 
TGCACAGTAA 
CTTCTACCAG 
GACTTTCCAA 
GAACCCTCAT 



21 
I 

G6G6ATGGTT 
GCTCAATAGA 
ATGTTTCACA 
AGTTCTTGAG 
TTGGAGAAGC 
CCTCACTAGA 
ATGAAAATGG 
CAAAGACTTC 
TATTTTTAAG 
AA6GGGTTTT 
6GCA0GAACC 
TGGGACTACA 

GCTAGTCTCA 
ATTACAQCAC 
CTGGACCCTG 
AGGTTTCCTT 
GCTACTTGTC 
TCAAGCTGTG 
AGACAAGAGA 
AAAGATTTGT 
AACAACAGCC 
CX3A6TTCCCA 
CTGAATACCT 
TGGGTTCrCA 
ACTGACAAGC 
ACTCAGAAOC 



31 
I 

TTATATAAAT 
AGAGATTTCT 
TTCCTGTTAC 
TCCTAACATT 
TTAAAGTATT 
AAAAGGGACA 
TGAACTAGTG 
CAGCATTTCC 
AACCCAGGA6 

' rmm 'rri'T 

TCATAGCTCC 
GGCATGAGCC 
GGGGGGGTTG 
AACTCCTGGC 
TTGGATTCAQ 
AGAAGCTATT 
ATCATCTTAT 
CCrrCAATGC 
ATGACAGAAG 
CTCTGGCTGA 
GGGAGCAGAC 
CTGCGAGCAA 
AATCTCTGCC 
ATCTCATCAA 
TTTTOCTCCT 
ACACTTTTTT 
TTCXAAATAA 



41 

I 

TCAGGTTTTT 
AATAQAAAAG 

AGATTTGTTC 
GAGAGGTTTT 
GCCACTTTAG 
ATGTTAGTGT 
TTTCCAAGCA 
AGGTCCTAGA 
GAAAGCTTTA 
AAAGACAGGA 
TGGACTTAAG 
CCCATGCCTG 
TTTTGTTTTT 
TTCAAGTGAT 
CTTCTTCATT 
GCAATGCCCC 
CAAGT6CAGA 
TCGCCACCaO 
TTGTATTGAO 
GAGCACTT66 
CAGCACCCCA 
CATAGTTACA 
GTATGTTCTG 
AT6CCAGAOC 
AATCTAATTA 
CCTCOOCGCT 
ACCTTTGATA 



51 
I 

AATQAC TTTC 
TGGTTTCAAC 
TTCACTTAGT 
ATGTTTCMA 
CAGGAGGCTA 
ATTACAGCIG 
QGCTGGGAG6 
AAACTGCA 



51 
i 

CCCAGAATAA 
GATTGAAACT 
TCTTGTGACT 
CCCTTAGTGC 
CACXGAAGAT 
GGCOCTTCCT 
TATTGGAAGG 
GAGGAACAA6 
TAATAGAACA 
TCTCACTCCA 
TGATCTGCCT 
GCTAA6TTTG 
TGTAQAQACQ 
CCTCCTGCCT 
TCCAACATGG 
TATGACAAAA 
AAGAATCATC 
GTTCCTCGAG 
CAAGATGTTO 
CAGTGCCCTC 
TTTGTCTGGQ 
GAACATAAGA 
CCATGGAAAA 
CTAGAAGACT 
TAGAATGGTA 
TGAATCCTCA 
CAGATTG 



Seq ID NO: 247 Protein sequence: 
Protein Accession #: XP_058553.1

11 



51 



1560 



21 31 41 

i I I I I I 

MEETVTDSLD PEKLLQCFYD RNHQIRACRF PYHLZKCRRN BFDVASKLAT CPPNARHQVP 
RAEISHHISS Ca)DRSCIEQD WNQTRSIiRQ ETLABSTHQC PPCDEDHDKD LHEQTSTPFV 
HGTTHySCtQI SPASNIVTEH KNNLASGMRV PKSLPYVLPH KMMGNAQ 



60 
120 



Seq ID NO: 248 DNA sequence
Nucleic Acid Accession #: NM_003392
Coding sequence: 756..1855






1          11         21         31         41         51
|          |          |          |          |          |
TTAAG6AAAT CCGGGCTGCT CTTCCCCATC TGGAA6TG6C TTTCOCCACA T06GCTCGTA 60
AACTGATTAT GAAACATACG AT6TTAATTC GGAGCTGCAT TTCCCAGCTG GGCACTCTCG 120
CGCGCTGGTC CCCGGGGCCT C6CCCCCCAC CCCCTGCCCT TCCCTCCCGC QTCCTGCCCC 180
CATCCTCCAC CCCCXX306CT GGCCACGCCG CCTCCTTGGC AGCCTCTG6C GGGAGCGCGC 240
TCCACICX3CC TCCOGTGCrC CTCTOQOCCA TGQAATTAAT TCTOGCTCCA CTTGTTGCTC 300
GGCCOUSGTT GGG6AGAGGA GGGAQQGTGG CCXtCAGOGQG TTCCTGAGTQ AATTACCCAG 360
GACGGACTGA GCACAGCACC AACTAGAGA6 GGGTCAG6GG GTGCGGGACT CGAGCX3A6CA 420
GGAAGGA6GC AQCGCCTGGC ACCAGGGCTT TQACTCAACA GAATTGAGAC AC6TTTGTAA 480
TGGCTG6CGT GCCCOSCGCA CAGGATCCCA GCQAAAATCA GATTTCCTQO TGA6GTTGCG 540
TGG(?rGGATT AATTTGGAAA AASAAACT6C CTATATCTTG CCATCAAAAA ACTCAOGGAG 600
GAOAAGGQCA GTCAATCAAC AGTAAACTTA AQAGACCCCC GATGCTOCCr TGGTTTAACT 660
TGTATGCTTG AAAATTATCT GAGAGGGAAT AAACATCTTT TCCTTCTTCC CTCTCCAGAA 720
GTCCATTQGA ATATTAAGCC CAGGAQTTGC TTTGGGGATG GCT6GAAGTQ CAATGTCTTC 780
CAA6TTCTTC CTAGTGGCTT TGQCCATATT TTTCTCCTTC 6CCCAGGTTG TAATTGAAGC 840
CAATTCTTGG TG6TCGCTAG GTATGAATAA CCCTGTTCAS ATGTCAflAAO TATATATTAT 900
AG6AGCACAG CCTCTCTGCA GCCAACTGGC AGGACTTTCT CAAGGACAOA AGAAACTGTG 960
CGACTT6TAT CAGGACCACA TGCAGTACAT CGGAGAAGGC GCGAAGACAG GCATCAAAGA 1020
ATGCCABTAT CAATTCCGAC ATCGACGGTG GAACTGCA6C ACTGTGGATA ACACCTCTGT 1080
TTTTGGCAGQ GTGATGCAGA TAGGCAGCCG C6AGACGGCC TTCACATACG CCGTGAGCGC 1140
AGCAGGGGTG GTGAACGCCA TGAGCOGGGC GTGCCGCGAG GGOGAGCTGT CCACCTGCGG 1200
CTGCAGCOSC GCC6CX3CGCC CCAAGGACCT GCOSCGGGAC TGGCTCTGGG GCGGCTGCGG 1260
06ACAACATC GACTATGGCT AC0GCTTT6C CAA6GAGTTC GTOGAOSCCX: GOSAGCGGGA 1320
6CX3CATCCAC 0CCAAG6GCT CCTACX3AGAG TGCTCOCATC CTCATGAACC TGCACAACAA 1380
C6A6GCCGGC C6CA6GAGGG T6TACAACCT GOCTGATGTQ GCCTGCAAGT GCCATGGGGT 1440
GTCCGGCTCA TGTAGCCTGA AGACATGCTG GCTGCAGCTG GCAGACTTCC GCAAGGTGGG 1500
TGATGCCCTG AAGGAGAAGT ACGACAGC6C GGCGGCCATG CGGCTCAACA GCCGGGGCAA 1560
GTTG6TACA6 GTCAACAGCC GCTTCAACTC GCXXACCACA CAA6ACCT66 TCTACATCX3A 1620
CCCCAGCCCT 6ACTACT006 TGOQCAATQA GAGChCGGGC TG6CTQ6GCA GQCAGGQCaG 1680
CCTGT6CAAC AAGAGGTCGG AGG6CATGGA TGGCTGCGAG CTCAT6T6CT GCGGC0GT6G 1740
GTAGGACCAG TTCAAGACXXf T6CAGAC6GA GCGCTGCCAC TGCAAQTTCC ACTGGTGCTG 1800
CTACX3TCAAG TGCAAGAAGT GCACGGAGAT CGTGGACCAG TTTGTGTGCA AGTAGTGGGT 1860
GCCACCCAGC ACTCA6CCCC GCTC0CA66A CCCGCTTATT TATA6AAAGT ACAGTGATTC 1920
T6GTTTTTGG TTTTTAGAAA TATTTTTTAT TTTTCCCCAA GAATT6CAAC OGGAACCATT 1980
TTTTTTCCTO TTACCATCTA AGAACrCTGT 6GTTTATTAT TAATATTATA ATTATTATTT 2040
GGCAATAATG GGGGTGGGAA CCACGAAAAA TATTTATTTT GTGGATCTTT GAAAAGGTAA 2100
TACAAGACTT CTTTTGGATA GTATAGAATG AAGGGGGAAA TAACACATAC CCTAACTTAG 2160
CTGTGTGGGA CATGGTACAC ATCCAGAAGG TAAA6AAATA CATTTTCTTT TTCTCAAATA 2220
TGCCATCATA TGGGATGGOT AGGTTCChGT TGAAAGAGGG TGGTAOAAAT CTATTCACAA 2280
TTCAGCTTCT ATGACCAAAA TGAGTT6TAA ATTCTCTGGT 6CAAGATAAA AGGTCTIG66 2340
A^^CAAAAC AAAACAAAAC AAACCTCCCT TCCCCAGCAG GGCTGCTAGC TTGCTTTCTG 2400
CATTTTCAAA ATGATAATTT ACAATGGAAG GACAAGAATG TCATATTCTC AAGGAAAAAA 2460
GGTATATCAC ATGTCTCATT CTCCTCAAAT ATTCCATTTG CAGACAGACC GTCATATTCT 2520
AATAGCTCAT GAAATTTGGG CAGCAGGGA6 GAAAGTCCCC AGAAATTAAA AAATTTAAAA 2580
CTCTTATGTC AAGATGTTGA TTTGAAGCTG TTATAAGAAT TGGGATTCCA GATTTGTAAA 2640
AAGACCCCCA ATGATTCTGG ACACTAGATT TTTTGTTTGG GGAGGTTGGC TTGAACATAA 2700
ATGAAATATC CTOTATTTTC TTAOGGATAC TTGGTTAOTA AATTATAATA GTAGAAAXAA 2760
TACAT6AATC CCATTCACAG GTTCCTCA6C CCAAGOUICA AG6TAATTQC GT6CCATTCA 2820
GCACTQCACC ASABCA6ACA ACCTATTTGA GGAAAAACAG TGAAATCCAC CTTCCTCTTC 2880
ACACTGABCC CTCTCTGATT CCTCCX3TGTT GTGATGTGAT GCTGGCCACG TTTCCAAAOG 2940
GCAGCTCCAC TGGGTCCCCT TTGGTTGTAG GACA66AAAT GAAACATTAG GAGCTCTGCT 3000
TG8AAAACAG TTCACTACTT AOOQATTTTT 6TTTCCTAAA ACTTTTATTT T6AGGAGCAG 3060
TAQTTTTCTA TGTTTTAATG ACAGAACTTG GCTAAT6GAA TTCACAGAGG TGTTGCAGOG 3120
TATCACTGTT ATGATCCTGT GTTTAGATTA TCCACrCATG CTTCTCCTAT TGTACTGCAG 3180
GTGTJ^XTTTA AAACTGTTCC CAGTGTACTT GAACAGTTGC ATTTATAAGG GGGGAAATGT 3240
GGTTTAATGG TQCCTGATAT CTCAAAGTCT TTTGTACATA ACATATATAT ATATATACAT 3300
ATATATAAAT ATAAATATAA ATATATCTCA TTGCAGCCAG TGATTTAQAT TTACAGCTTA 3360
CTCTQGQQTT ATCrCTCTGT CTA6AGCATT GTTQTGCTTC ACIGCA6TCC AGTTGGGATT 3420
ATTCCAAAA6 TTTTTTGAGT CTTGAGCTT6 G6CTGTGGCC COGCTGTGAT CATACCCTGA 3480
6CA0GACX?AA GCAACCTOGT TTCTGAGGAA 6AAGCTTGAG TTCTGACrCA CTGAAATGCG 3540
TGTTGGGTT6 AAGATATCTT TTTTTCTTTT CTGCCTCACC CCTTTGTCTC CAACCTCCAT 3600
TTCTGTTCAC TTT6TGGA6A GGGCATTACr TOTTOGTTAT AGACATGGAC GTTAAGA6AT 3660
ATTCAAAACT CAGAAGCATC AQCAATGTTT CTCWTTCVr AGTTCATTCT GCA6AAT6GA 3720
AACCCATQCC TATTAGAAAT GACAGTACTT ATTAATTGAG TCCCTAAGGA ATATTCAGCC 3780
CACTACATAG ATAGCTTTTT [illegible] TTTTTTTTAA TAAGGACACX: TCTTTCCAAA 3840
CAGGCCATCA AATATGTTCT TATCTCAGAC TTAC3GTTGTT TTAAAAQTTT OOAAAGATAC 3900
ACATCTTTTC ATACCCCCCC TTAGGAGGTT GGGCTTTCAT ATCACCTCAG CCAACTGTGG 3960
CTCTTAATTT ATTGCATAAT GATATCCACA TCAGCCAACT GTGGCTCTTT AATTTATTGC 4020
ATAAT6ATAT TCACATCCCC TCAGTTGCAG IX^VATTGTGA GCAAAAGATC TTGAAAGCAA 4080
AAAGCACTAA TTAGTTTAAA AT6TCACTTT TTTGGTTTTT ATTATACAAA AACCATGAAG 4140
TACTTTTTTT ATTTQCTAAA TCAGATTGTT CCTTTTTAGT GACTCATGTT TATGAAGAGA 4200
GTTGAGTTTA ACAATCCTAG CTTTTAAAAG AAACTATTTA ATGTAAAATA TTCTACATGT 4260
CATTCAGATA TTATGTATAT CTTCTAGCCT TTATTCTGTA CTTTTAATGT ACATATTTCT 4320
GTCTTGCGTG ATTTGTATAT TTCACTGGTT TAAAAAACAA ACATCX3AAAG <3CTTATTCCA 4380
AATGGAAGAT AGAAIATAAA ATAAAACX3TT ACTT6TAAAA AAAAAAAA



Seq ID NO: 249 Protein sequence:
Protein Accession #: NP_003383




MAGSAMSSKP FLVALAIFFS FAQWIEAMS WWSLGMNNPV QMSEVYIIGA 
SOGOKKZiCBL YQDHKQYKffi GMCTGZKEOQ YQFRBRRHNC STVDNTSVF6 
AFTYAVSAAO WMAMSRACR EQBLSTCQCS RAARPKDLPR DWLHGGOGDN 
PVDARERERI HAKQSYESAR lUINLHNNEA 6RRTVYNLAD VACKCHGVSG 
LADFRKVGDA LKBKYDSAAA MRUJSRGKLV QVNSRPMSPT T QDLVY IDPS 
QSLQTQGRLC NKTSBC34DGC ELMC06R6YD QFKTVQTERC HCKFHWCCW 
QFVCK 

Seq ID NO: 250 DNA sequence
Nucleic Acid Accession #: NM_014058
Coding sequence: 56..1324

1 11 21 31 41 51 

1 I I I I I 

TGACTTGGAT GTAQACCTCG ACCTTCACAG OACTCTTCAT TGCTGOTTGG CAATGArTGTA 60 

TCGGCCAGAT GTQOTOAGOG CTAGGAAAAG AGTTTGTTGG GAACXXTTCGG TTATOGGCCT 120 

OGTCATCTTC ATATCCCTGA TTGTCCTGGC AGTGTGCATT GGACTCACTG TTCATTATGT 180 

QAOATATAAT CAAAAGAAGA CCTACAATTA CTATAGCACA TTQTCATTTA CAACT6ACAA 240 

ACTATATGCT GAjSTTTGGCA GAGAGGCTTC TAACAATTTT ACAGAAAT6A GCCAG AGACT 300 

TGAATCAATG GTGAAAAATG CATTTTATAA ATCTCCATTA AQG8AA6AAT TTOTC AAOT C 360 

TCAGGTTATC AAGTTCAGTC AACAGAAGCA TGOAGTGTTG GCTCATATGC TGTTGATTTG 420 

TAGATTTCAC TCTACTGAGG ATCCTGAAAC TGTAGATAAA ATTGTTCAAC TTGTTTTACA 480 

TGAAAA6CTG CAAGATGCTG TAGGACCCCC TAAAGTAGAT CCTCACTCAG TTAAAATTAA 540 

AAAAATCAAC AAGACAGAAA CAGACAGCTA TCTAAACCAT TGCTGCGQAA CAC6AAGAAG 600 

TAAAACTCTA GGTCAGAGTC TCAGGATCX3T TGQTGGGACA GAAGTAGAA6 AGGGTGAATG 660 

GCCCTGGCAG GCTAGCCTGC AGTGGGATGG GAOTCATC6C TGTGGAGCAA CCTTAATTAA 720 

T6CCACATCG CTTGTGAGTG CTGCTCACTG TTTTACAACA TATAAGAACC CTGCCAGATG 780 

GACTGCTTCC TTTGGAGTAA CAATAAAACC TTOGAAAATG AAACGGGGTC TCCX36AQAAT 840 

AATTGTCCAT GAAAAATACA AACACXXATC ACATGACTAT GATATT TCTC TTGCAQA6CT 900 

TTCTAGCCCT GTTCCCTACA CAAATGCAGT . ACATAGA6TT T6TCTCCCTG ATGCATCCTA 960 

TGAGTTTCAA CCAGGT6ATG TGATGTTTGT 6ACAGGATTT GGAGCACT6A AAAATOATGG 1020 

TTACAGTCAA AATCATCTTC GACAAGCACA GGTGACTCTC ATAGACOCTA CAACTTGCAA 1080 

TGAACCTCAA 6CTTACAATG ACGCCATAAC TCCTAGAATG TTATGTGCTO QCTCCTTAGA 1140 

AGGAAAAACA GATGCATGCC AGGGTGACTC TGGAGGACCA CTQOTTAGTT CAGATGCTAG 1200 

AOATATCTGQ TAOCTT G CTG GAATAOTGAG CTGGGGAGAT GAATGTGGQA AACXX3UICAA 1260 

GCCTGGTGTT TATACTAGAG rPAOGGCCTT GOGG QACTGQ ATTACTTCAA AAACT6GTAT 1320 

CTAA6AGAGA AAAGCCTCAT GGAACAGATA ACATTTTTTT TTQTTTTTTG GGTGT6GA60 1380 

CCATTTTTAG AGATACAGAA TTGGAGAAGA CTTGCAAAAC AGCTAGATTT GACIGATCTC 1440 
AATAAACTGT rTGCTTGATG CAAAAAAAAA A 



Seq ID NO: 251 Protein sequence: 
Protein Accession #: NP_054777

1 11 21 31 41 51 

I I i I i I 

MYRPDWRAR KRVCWEPWVI GLVIPISLIV LAVCIGLTVH yVRYNQKKrY NVYSTLSFTT 60 

DKLYAEFGRE ASNNFTEMSQ RLESMVKNAP YKSPLREEPV KSQVIKFSQQ KHGVLAHMUIi 120 

ICRPHSTEDP ETVDKIVQLV liHEKXiQQAVG PPKVDPHSVK XKKINKTETD SYIiNHCCGTR 180 

RSKTLGQSLR IVGGTEVEBG EWPffQASLQW DGSHROGATL XNATWLVSAA HCFTTYKNPA 240 

RffTASFGVTI KPSiEMKRQIiR RUVHEKntH PSHDTOISIiA ELSSEVPrm AVHRVCLPDA 300 

SYBFQPGDVN FVTGFQAXjRN DGYSQUBUtQ AQVTLIDATT CNEPQAYNDA ITPRMLCAGS 360 

LEGKTDAOQQ DSGGPLVSSD AKDZWYIAGZ VSNGDECAKP MKPGVyTBVT ALRDHITSKT 420 
GX 

Seq ID NO: 252 DNA sequence 

Nucleic Acid Accession #: NM_003504.2

Coding sequence: 71-1771

1 11 21 31 41 51

I I 1 I I I 

GGCaCGAGGC CTCGTGCCGC CGGGCTCTTG GTACCTCAGC G06AGCGCXA GGC6TCCGGC 60 

OGCCGTGGCr ATGTTCGTGT CCGATTTCCG CAAAGAGTTC TACQAGOTGO TGCAGAGCCA 120 

GAGGGTCCTT CTCTTCGTGG CCTCGQACGT GQATGCTCTG TGTGCGTGCA AGATOCTTCA 180 

GGCCTTGTTC CAGTGTQACC AC6TGCAATA TACGCTGGTT CCAGTTTCTG GGTGGCAAGA 240 

ACTT6AAACT GCATTTCTTG AGCATAAAGA ACAGTTTCAT TATTTTATTC TCAT AAACTG 300 

TGGAGCTAAT GTAGACCTAT TGGATATTCT TCAACCTGAT GAAGACACTA TATTCTTTGT 360 

GTGTGACACC CATAGGCCAG TCAATGTCGT CAATGTTATAC AACSGATACCC AGATCAAATT 420 

ACTCATTAAA CAAGATGATG ACCTTGAAGT TCC060CTAT GAAQACATCT TC3«3GGATGA 480 

AGAGQAGGAT GAAGAGCATT CAGGAAATOA CAGTGATGGG TCAGAGCCTT CTGAGAAGOG 540 

CACA06GTTA GAAGAGGAGA TAGTGGAGCA AACCATGCGG AGGAGGCAGC GGCGAGAGTG 600 

GGAGGCCOGG AGAAGAGACA TOCTCTTTGA CTAOQAGCAG TATGAATATC ATGG6ACATC 660 

GTCAGCCATG 6TGATGTTTQ AGCTGOCTTO QATOCTGTCC AAG6ACCTGA ATGACATGCT 720 

GTGGTGGGCC ATCGTTGGAC TAACAGACCA 6TGGGTGCAA GAGAA6ATCA CTCAAATGAA 780 

ATACGTGACT GAT6TTGGTG TCCTGCAGCG CCACGTTTCC OGCCACAACC ACOGGAACGA 840 

GGATGAGQAQ AACACACTCT CC30TGGACTG CACACGGATC TCCTTTGAGT ATGACCTCOG 900 

CCTGGTGCTC TAGCAGCACT GGTCCCTCCA TGACA6CCTG TGCAACACCA GCTATACOGC 960 

AGCCAGGTTC AAGCTGIGGT CTSIGCATGG ACAGAAOOGG CT0CA06AGT TCCTTGCAGA 1020 

CATGGGTCTT CCCXTTGAAGC AG6TGAAGCA GAAGTTCCAG 6CCATG6ACA TCTCCIT6AA 1080 

GGAGAATTTG CGGGAAATGA TTGAAGAGTC TGCAAATAAA TTTGGGATGA AGGACATGCG 1140 

CGTQCAGACT TTCAGCaiTC ATTTTGGGTT CAAQCACAAG TTTCTQQCCA GCGAOGTGGT 1200 

CTTTGCCACC ATQTCTTTGA TGGAGAGCCC OGAGAAGGAT GGCTCAGQGA CAGATCACTT 1260 

CATCCA6GCT C T 6GACAG0C TCTCCAGGAG TAACCTGQAC AA6CTGTACC ATGGCCTGGA 1320 

ACTCGCCAAG AA6CA6CTGC GAGCCACCCA GOMSACCATT GCCAGCTGCC TTTGCaCCAA 1380 

CCTOGTCATC TCCCAGGGGC CTTTCCTGTA CTGCTCTCTC ATGGAGGGCA CTCCAGATGT 1440 

CATGCTGTTC TCTAGGCCGG CATCCCTAAO CCTGCTCAGC AAACACX:TGC TCAAGTCCTT 1500 

TGTGTGTTCG ACAAAGAACC GGCX3CTGCAA ACTGCTGCCC CTGGTGATGG CTGCCCCCCT 1560 



QPLCSQLAGL 60 

RVMQIGSRBT 120 

IDYGYRFAKE 180 

SCSLKTCWLQ 240 

PDYCVRNEST 300 

KCKKCTErVD 360 
OAGCATGGAG CftTGGCACAG TOACCGTGGT GS6CATCCCC CCAGAOACGS
CAOOAASAAC TTTTTTOGGH GOGOGTTTGA 6AAG6CA606 GAAA(3CACC3k 
8CT6CACAAC CATTTTQACC TCTCftCiTAAT TQAGCTQAAA GCTGAGQATC 
TCTGOACtSCA CTTATTTCCC TCCTOTCCTA GGAATTTOAT TCTTCCAOAA 
ATTTATSTAA CTG5CTTTCR TTTAGATTOT AAGTTATGOA CATOATTTSA 
CCATTTTTTa TEAAATAAAA TGdTATTTT AOOOICOOTC CCOAAAAAA 
AAAAAAAAAA AA 

Seq ID NO: 253 Protein sequence:
Protein Accession #: NP_003495.1



1 ■ 
I 

MFVSDFRKEP 
AFLEHKEQFH 
QDDDLEVFAY 
RRDILFDYEQ 
DVGVLQSBVS 
KLWSVHGQKR 
FSIHFGFKHK 
KQLRATQQTI 
TKNRRCKLIjP 
KFDIiSVIELK 



11 
I 

YEWQSQRVL 
YFILINCGAN 



21 



41 



LFVASDVDAL CACKILQAliP QCDHVQYTLV 
VDLLDILQPD EDTIPPVCDT HRPVNWNVY 



YEYBGTSSAH 



LQBFLADM6L 
PIASDWFAT 
ASCLCTNLVI 
LVMAftPbSME 
AEDRSKFLDA 



VMFELAWMLS 
NTLSVDCTRI 
PtiKQVKQKFQ 
NSLMESPEKD 
SQ6PFLYCSL 
BQTVTWOXP 
LXSLLS 



KDLMDMLNNA 
SFEYDLRLVL 
AMDISLKENL 
GS6TDHFIQA 
ME6TFDVMLF 
PETDSSDRKN 



rVOLTDQWVQ 
YQBHSLHDSL 
REMIEESANK 
LOSIiSRSNLD 
SRPA8LSLLS 
FF6RAFEKAA 



ACAGCT06QA 
6CTCCCGGAT 
6GAGCAAGTT 
TGACCTTCTT 
GATGTAGAAG 
AAAAAAAAAA 



51 
I 

PVSGWQEIiGT 
NDTQIKLLIK 
RRQBREfifEAR 
DKITOHKYVT 
CNTSYTAARF 
FGMKDMRVQT 
KLYHGIiELAK 
KHLLKSFVCS 
ESTSSSHUHN 



Seq ID NO: 254 DNA sequence
Nucleic Acid Accession #: NM_022337
Coding sequence: 48..683



1          11         21         31         41         51
|          |          |          |          |          |
GGCTGCGCTT CCCTGGTCAG GCACGGCACG TCTGGCCX3GC CGCCAGGATG CAGGCCCCGC 60
ACAAG6AGCA CCTGTACAAG TT6CTGGTGA TTGGCQACCT GGGCGTGGGG AAGACCAGTA 120
TCATCAAGGG CTACGTGCAC CAGAACTTCT CCTCGCACTA CGGGGCCACA ATOSGCXSTGG 180
ACTT06O8CT CAAG6T0CTC CACTGGOACC GGGAGACXOT GGIQOGCCTG GAGCTCTGGQ 240
ATATCGCAGG TCAAGAAAOA TTTQOAAACA TGAGGAG6GT CTATTACCGA 6AA6CTATGG 300
6TGCATTTAT TGTCTTCGAT GTCACCAGGC CAGCCACATT TGAA6CAQT6 GCAAAGTGGA 360
AAAATGATTT QQACTCCAAG TTAA6TCTCC CTAATGGCAA ACCX3GTTTCA GTGGTTTTGT 420
TGGOCAACAA ATGTGACCAG GG6AAGGATG TGCTCAT6AA CAATGGCCTC AAGATGGACC 480
AGTTCTGCAA Q8AGCA0GGT TTCGTAGGAT G6TTTQAAAC ATCMSCAAAG GAAAATATAA 540
ACATTGATGA AGCCTCCAGA TGCCTGGTGA AACACATACT TGCAAATGA6 T6TGACCTAA 600
TG6AGTCTAT TGAGCCGGAC GTOGTGAAGC CCCATCTCAC ATCAACCAAG GTTGCCAGCT 660
GCTCTGGCTG TGCCAAATCC TAGTAGGCAC CTTTGCTGGT GTCTGGTAGG AATOACCTCA 720
TTGTTCCACA AATTGTGCCT CTATTTTTAC CATTTT6GGT AAAC6TCAGG ATAGATATAC 780
CACATGTGGC AAGCCAAAQA TCTATGCCTC TQTTTTTTCA ATGftGAGAQA AATAGCAAAT 840
GTTCTTTCTA TGCTTTCCTC ACCATCATCA CAGTGTTTAC AAACTTTT6A AAATATTTAO 900
TCTGTTACAA ACTTCTGTCA TGTAGCTGAC CAAAATCXTG CAGGGCCACA GTCGGCACTG 960
TTATTTGCTT CTTTTAATCA GCAAAGGCCT CAAGTCTTAA AATAAAA6GG GAGAAGAACA 1020
AACTAGCTGT CAAGTCAAGG ACTGGCTTTC ACCTT6CCCT GGTGTCTTTT TCCA6ATTTC 1080
AATATATTCT CTGATGGCCT GACAGGCCTA TTAAGTAGAT GTGATATTTT CTTCCAAGAT 1140
GACCTCCATT CTCGGCAGAC CTAAGAGTTG CCTCT6AGTT AGCTCTTTGG AATCGTGAAC 1200
ACAGGTQTGC TATATTGTCC TTGTCCTAAC T6TCACTT6C CAT6GCCTGA ATGTTGGCTT 1260
AACTQAATAT TSTATGAAAA CSUATGGCTC CATATOTGCC TTTCTGTTAG CTCTCTTTGA 1320
CTCAAGCTGT GGGGCTCCTC TATACATQCr ATAaVTCTAA TATATATTAT ATATATTTTT 1380
GCAAGTGAAC AATAAAACAT TAAAAGATAA AA



Seq ID NO: 255 Protein sequence:
Protein Accession #: NP_071732

1          11         21         31         41         51
|          |          |          |          |          |
MQAFKKEBLY KZiLVIGDLGV GRTSIIKRYV BQNFSSHnUV TIGVDFALKV LENDPETWR 60
LQLMDIAGQE RFSNHTRVYY REANGAFIVF DVTRPATFEA VAKHKNDLDS KLSLPNGKFV 120
SWLLANKCD QGXDVLMNHG LKMDQFCKBH GFVGWFETSA KBNIKIDEA5 RCLVKHILAN 180
ECaSLMBSISP DWKFHLTST KVASC8GCAK S

Seq ID NO: 256 DNA sequence
Nucleic Acid Accession #: NM_016321
Coding sequence: 25..1464



1          11         21         31         41         51
|          |          |          |          |          |
GGAACCGCCC GCTGCCAGCC GGGCCAGGCA CCCCTGCAGC ATQGCCTGGA ACACCAACCT 60
CCGCTGGCGG CTGCCGCTCA CCTGCCT6CT CCTGCAG6TG ATTATGGTGA TTCTCTTCG6 120
GGTGTTCGTG CQCTACGACT TCGAGGCCQA OOCCCACTGG TGOTCAGAaA GGACGCACAA 180
GAACTTGAGC GACATG6AGA A06AATTCTA CTATCGCTAC CCAAGCTTCC AGGACGTGCA 240
06TGATGGTC TTCGTGGGCT TCGGCTTCCT CATGACTTTC CTGCAGCGCT ACGGCTTCAG 300
CGCCGTGGGC TTCAACTTCC TGTTGQCAGC CTTCGGCATC CAGTGGGCGC TGCTCAT6CA 360
GGGCTGGTTC CACTTCTTAC AAGACCGCTA CATCGTC6TG GGC6TGGAGA ACCTCATCAA 420
G6CTGACTTC TQCGIGGCCT CT6TCTQCQT GGCCTTTGGG GCAOTTCTGO GTAAAGTCA6 480
CCCCATTCAG CTGCTCATCA TGACTTTCTT CCAAGTGACC CTCTTCGCTG TGAATGA6TT 540
CATTCTCCTT AACCTGCTAA AGGTGAAGGA TGCAG6AGGC TCCATGACCA TCCACACATT 600
TGGCGCCTAC TTTGGGCTCA CAGTGACCCG GATCCTCTAC CGACGCAACC TAGAGCAGAG 660
CAAGGAGAGA CAGAATTCTG TGTACCAGTC GGACCTCTTT QCCATGATTG GCACCCTCTT 720
CCTQTGGATG TACTGGCCCA 6CTTCAACTC A6CCATATCC TAOCATGGGG ACAGCCAGCA 780
OCGAGCOGCC ATCAACACCT ACTGCTCCTT GOCAGCCTGC GTQCTTACCr CaOTGGCAAT 840



1620 
1680 
1740 
1800 
1660 
1920 
60
120 
180 
240 
300 
360 
420 
480 
540 



ATCCAGIOCC CTGCACAAGA AGGGCAAGCT GGACATGGfTG CS^CATCCAGA ATGCCACGCT 900

CGCRGGAGGG CPTGGCCGTGG GTACCGCTGC TGAfiATGATO CTCATGCCTT ACGGTGCCCT 960 

CATCATOGGC TTCXSTCTGCG GCATCATCTC CACCCTGGGT TTTGTATACC TC5A0CCCATT 1020 

CCTGGAGTCC CX^GCTGCACA TCCAGGACAC ATGTGGCATT AACAATCTGC ATGGCATTCC 1080 

TGGCATCATA GGCGGCATCG TGGGTGCT6T GACAGC6GCC TCa3CC3M5C3C TTGAAOTCTA 1140 

TCGAAAAGAA GGGCTTGTCC ATTCCTTTGA CTTTCAAGGT TTCAACGGGG ACTOOACCGC 1200 

AAGAACAOkG GGAAAOTTCC AGATTTATGG TCTCTTGGTG ACCCTGGCCA TGGCCCTGAT 1260 

• GGOTOGCATC ATTGTGGGGC TCATTTTGftQ ATTACJCATTC TGGGQACAAC CTTCAGATGA 1320 

6AACTGCTTT GAGQATGOGG TCTACTGGGA GATGCCTGAA GGGAACAGCA CTGTCTACAT 1380 

CCCIGAGGAC CCCACCTTCA AQCCCTCAGG ACCCTCAGTA CCCTCAGTAC CCATGQT6TC 1440 

COCACTACCC ATGGCTTCCT C3GGTACCCTT GGTACCCTA6 GCTCCCAGGG CAGOTGAGGA 1500 

QCAOaCTCXa CAGACTSTCC TGGGGCCCAG AGGAGCTOOT GCTGACCTAG CTAG66ATGC 1S60 

AAGAOTGAQC AAGCAGCACC CCCACCTGCT GGCTTGGCCT CAAGGTGCCT CCACCCCTGC 1620 

CCTCCCCTTC ATCX:CAGGGG GTCTGMCTGA GAATGQAORA GGAGAAGCTA CAAAGTGGGC 1680 

ATCCAAGCCG GGTTCTGGCT GCAGAAGTTC TGCCTCTQCC TOGGGTCTTG GCCACATTGG 1740 

AGAAAAACAG GCTCAAAGTG GGGCTGGGAC CTGGTGGGTO AACCTGAGCT CTCCCAGGAG IBOO 

ACAACTTAGC TGCCAGTCAC CACCTATGAG GCTCTTCTAC CCCGT6CCT0 CACCTCGSCC 1B60 

AGCATCTCCT ATGCTCCCT6 GOTCCCCCAG ACCTCTCTGT GTTGTGTGCO TGGCA6CCTC 1920 
CAGGAATAAA CATTCTTGTT GTCCTTrGTA AAAAAAAAAA AAAAAAAA 

Seq ID NO: 257 Protein sequence:
Protein Accession #: NP_057405

1 11 21 31 41 51

MAWNTNLRWR LPLTCLLLQV IMVILPGVFV RYDPERDABW WSBRTHKNLS DMBMBPYYRY 60 

PSFQDVHVMV FVGFGFLMTF LQRYGPSAVG FNFLLAAFGI QWALtMQGWF HFIiQDRYIW 120 

GVENIilNADP CVASVCVAFG AVLGKVSPIQ LLIMTFFQVT LFAVNEFIIiL NLLKVKDAGG 180 

SMTIHTFGAY FGLTVTRILY RRIJLBQSKER QNSVYQSDLP AMIGTLFI.WM YWPSPNSAIS 240 

YHGDSQHRAA INTYCSLAAC VLTSVAISSA LHKKGKLDMV HIQSATLAGG VAVGTAAEMM 300 

LMPYGALIIG PVCGIISTLG FVYLTPPLES RLHIQDTOSX MHiHGIPGII 6GIVGAVTAA 360 

SASLEVYGKB GLVHSFDPQG PMGDWTARTQ GKPQIYGLLV TLAMALMQOl IVGLIliSLPP 420 
HGQPSDBMCF EDAVYWBMPE GMSTVYIPBD PTPKPSGPSV PSVPMVSPIiP HASSVPIiVP 

Seq ID NO: 258 DNA sequence

Nucleic Acid Accession #: NM_002358.2

Coding sequence: 75..692

1 11 21 31 41 51 

GGGAAGTGCT GTTGGAGCCG CTGTGGTTGC TGTCC6CGGA GTGGAAGCX5C GTGCTTTTGT 60 

TTGTGTCCCT GGCCATGQCG CTGCAGCTCT CCOjGGAGCA GGGAATCACC CTGCGCX3GGA 120 

GOGCOGAAAT OGTGGCCGAG TTCTTCTCAT TCGGCATCAA CAGCATTTTA TATCAGCGTG 180 

GCATATATCC ATCTGAAACC TTTACTCGAG TQCAGAAATA CGGACTCACC TTGCTTGTAA 240 

CTACTGATCT IGAGCTCATA AAATACCTAA ATAATQTGGT QGAACAACTG AAAGATTGGT 300 

TATACAAGTO TTCAGTTCAG AAACTGGTTG TAGTTATCTC AAATATTGAA AGTGGTGAG6 360 

TCCTGGAAAG ATGGCAGTTT GATATTGAGT GTGACAA6AC TGCAAAAGAT GACAGTGCAC 420 

CCAGAGAAAA QTCTCAGAAA GCTATCCA6G. ATGAAATCCG TTCAGTGATC AGACAGATCA 480 

CAGCTAOSGT GAC^TTTCTG CCACTGTTGG AAGTOCTTG TTCATTTGAT CTGCTOATTT 540 

ATACAGACAA AGATTTGGTT GTACCTGAAA AATGGGAAGA GTCGGGACCA CAGTTTATTA 600 

CCAATTCTGA GGAAGTCCGC CTTCGTTCAT TTACTACTAC AATCCACAAA GTAAATAGCA 660 

TGGTGGCOTA CAAAATTCCT GTCAATGACT QAGGATGACA TGAGGAAAAT AATQ IAATTO 720 

TAATTTTGAA AT6TQGTTTT CCTGAAATC3V GGTCATCTAT AGTTGATATG TTTTATTTCA 780 
TTGOTTAATT TTTACATGQA OAAAACCAAA ATGATACTTA CTGAACTGTG TGTAATTGTT ' 840 

CCTTTATTTT TTTGGTACCT ATTT6ACTTA CCATGGAGTT AACATCATGA ATTTATTGCA 900 

CATTGTTCAA AAGGAACCAG GAGGTTTTTT TGTCAACATT GTGA TGTAT A TTCCTTTGAA 960 

GATAGTAACT GTA6ATGGAA AAACTTGTGC TATAAAGCTA GATGCTTTCC TAAATCAGAT 1020 

GTTTrGGTCA AGTAGTTTGA CTCAGTATAG GTAGGGA6AT ATTTAAGTAT AAAATACAAC 1080 

AAAOSAAOTC TAAATATTCA GAATCTTTGT TAAQQTCCTG AAAGTAACTC ATAATCTATA 1140 

AACAATGAAA TATTGCTGTA TAGCTCCTTT TGACCTTCAT TTCATGTATA GTTTTCCCTA 1200 

TTGAATCAGT TTCCAATTAT TTGACTTTAA TTTATGTAAC TTGAA CCTAT GAAGCA AT GG 1260 

ATATTTGTAC TGTTTAATGT TCTGTQATAC AflAACTCTTA AAAATCTTTT TTCATGTGTT 1320 

TTATAAAATC AAGTTTTAAG TOAAAGTGAG GAAATAAAOT TAAGTTTGTT TTAAAAAAAA 1380 
AAAAAAAAAA 



Seq ID NO: 259 Protein sequence: 
Protein Accession #: NP_002349.1

1 11 21 31 41 51

iLlQLSREQG ITLRGSAEIV ABFFSPGIHS ILYQROIYPS ETPTRVQKYG LTLLVTTDLE 60 

LIKYLNNWE QLKDWLYKCS VQKLVWISN IBSOTVI»BRW QPDIBCDRTA KDDSAPRBK8 120 

QKAlQDEIRfi VIBQITATVT PLPLtEVSCS PDUiIYTDKD LWPBKMEES GPQFITHSEB 180 
VRLRSFTTTI HKVNSMVAYK IPVMD . 

Seq ID NO: 260 DNA sequence
Nucleic Acid Accession #: NM_001211
Coding sequence: 43..3195

1 11 21 31 41 51 

AAAGGCCTGC AGCAGGACGA 6GACCTGAGC CAGGAATQCA GGATGGOGGC GGTGAAGAAG 60 

6AAGGGGGTG CTCTGAGTGA AGCCATQTCC CTGGAGGGAG ATGAATGGGA ACTGAGTAAA 120 

GAAAATQTAC AACCTTTAAG GCAAGGGCGG ATCATGTCCA CGCTTCAGGG AGCACPGGCA 180 

CAA6AATCTG CCTGTAACAA TACTCTTCAG CAGCRGAAAC GGGCATTTGA ATATGAAATT 240 
CX5ATTTTACA
CAGAACTATC 
GAAGCACTAC 
AAATTAGGGC 
ATTGGTGTTT 
AACTTTAGGA 
GAAAGACTAC 
GCACTTGAGA 
CTAGCTGAAC 
GGTGCTCTCA 
CAAAATAATA 
TTGTCTAAGC 
GAGCTGOUU? 
ACAGCTTCAC 
ACTGCACAAC 
AGCACCAGAA 
CAAGGSTCTO 
GTAGGGOAAT 
CAAAGGGAAG 
GAAOAGATGG 
CAGCAAGAAG 
C2U3AAAATAC 
GAAACTTCAC 
CCTTTCTCCA 
GATCCCCCAC 
ATCACCTCAA 
TTGAGCGAGG 
GACACTT6TG 
TCCTTGAAGG 
AAGACCTCTG 
AAGAAGCTGA 
GGTTCTTCTG 
CTAGftACTTA 
CGCAGACAGC 
GAC3U3AOCAA 
TGCATTAAAC 
AACTCTGCAG 
ATCAACCTCA 
CAATATCAAG 
CTTCTCCAAC 
TTGACAATAG 
TGTCTGATTC 
TTGAAGATAG 
CTCA6GGGCT 
TCTCCCTACC 
GAACACCTAC 
CTAAAAGATG 
GCCACAGTGT 
TTCCAAA6TC 

GTATTGTGGA 
CTACCATTGC 
ACAGTQATAT 
CAGACTCATT 
CTTTTCCCAT 
TAAAATAGTA 



CTG6AAATGA 
CTCAAGGTGG 
AAG6A6AAAA 
GTTTATGCAA 
CACTTGCTCA 
AAGCAGATGC 
AGTCCCAGCA 
AAGAAGAAGA 
TAAAGAGCAA 
A66CTCCAAG 
GTA6AATTAC 
CTACAGTCCA 
CAG6CCCTTG 
TGATAGCTGT 
AGCCA6TTAT 
AGCCTGGAAA 
AGQAGAAGAA 
TCTCXOTTGA 
CCGA6CTATT 
AGAAGAA6CT 
AGACGATGCC 
CA6GAATGAC 
TTGOGGAGAA 
TTTTTGATGA 
GAGTTTTAGC 
ATGAAGATGT 
ATGCCATTAT 
ACTTTGCCAG 
ATCTCCCTTC 
AGGACCAGCA 
QCCCAATTAT 
CCTCGGTTGC 
CTAATGAGAC 
TACTGAAGTC 
TGCCTAAGTT 
GAGAATACCT 
AATTAACAGT 
AGTTAAAGGA 
ATGQCTGTAT 
ACA6T6AATA 
TG6AGATCCT 
TCA6AAACAG 
TQGACTTTTC 
TTCGGACTGT 
AOGTAGAOCT 
AOGTCTTCTG 
GTGAATTGTG 
CTGTTCTTGG 
ACCTGAACAA 
AGT6A6CTAG 
ACACTGAAAC 
TGTTCTACTT 
ACTTACTCAT 
ACAAATGGTT 
TTGTAATTT6 
TCTGTTAAAA 



CCCTCTGGAT 
OAAAGAGAGT 
ACGATATTAT 
TGAGCCTTTG 
GTTCTATATC 
GATATTTCAG 
CC38ACAATTC 
GGA66AA6TT 
AGGGAAAAA6 
CCA6AACAGA 
TGTTTTTQAT 
G0CAT6GA3A 
GAACACA66C 
ACCC6CTGT6 
GACACCATGT 
GGAAGAAGGA 
AGA6AAGATG 
AGAAATTOOG 
GACCAGTGCA 
AAAAGAAATC 
TACAAAG6AG 
TCTATCCAGT 
CATTT6GCAG 
GTTTCTTCTT 
TCAAC6AAGA 
GTCTCCAGAT 
CACAGGCTTC 
AGCAGCTCGT 
T6ATCCTGAG 
GACAGCTTGT 
TGAAGACAGT 
AAGCACCTCC 
TTCAGAAAAC 
CCTACCA6AG 
0GAAATT6AG 
AATATGT6AA 
AATAAAGGTA 
ACX3TTTAAAT 
TGTTTGGCAC 
TATTACCCAT 
ACACAAAGCA 
AATCCA06AT 
CTACAQTGTT 
ACAGATCCTG 
GTTTGGTAXA 
GGATGG6TCC 
GAATAAATTC 
GGA6CTTGCA 
AGCCTTATGG 
GCAATCAAGT 
TQTATGTCCT 
TTTGGTACAG 
GGCCTTGTCT 
ACCTTGTTAT 
TAAAATGTTC 
AAAAAAAAAA 



6TTTGGGATA 
AATATGTCAA 
AGTGATCCTC 
6ATATGTACA 
TCATGGGCA6 
6AAG68ATTC 
CAAGCTOQAO 
TTTQAQTCTT 
ACAGCAAGAG 
G6ACTCCAAA 
GAAAATGCT6 
GCACCCOCCA 
A6GTCCTTGG 
CTTCCCAGTT 
AAAATTGAAC 
GATCCTCTAC 
ATQTATTQTA 
6CTGAA8TTT 
GAGAA6AGA6 
CAAACTACTC 
ACAACTAAAC 
TCT6TTTQTC 
GAACAACXrrC 
TCAGAAAA6A 
CCCCTTGCAG 
GTTTGTGAT6 
AGAAATGTAA 
TTTGTATCCA 
A6ACTGTTAC 
GGCACTATGT 
06TGAAGCCA 
TCCATCAAAT 
CCTACTCAGT 
TTAAGTGCCT 
AAGGAAATT6 
GATTACAAGT 
TCTTCTCAAC 
GAAGATTTTG 
CAATATATAA 
GAAATAACAG 
GAAATAGTCC 
CCCTATGATT 
GACCTTAGGG 
GAAGGACAAA 
GCAQATTTAa 
TTCTGQAAAC 
TTTGTGCGGA 
GCAGAAATGA 
AAG6TAGGGA 
CTCACA6ATT 
GTAATTTAAT 
GTATATTTTG 
AACTTTTGTG 
TTAACCCATT 
TCTTATGATC 
AAAAAAAAAA 



Seq ID NO: 261 Protein sequence:
Protein Accession #: NP_001202



MAAVKKBGGA 
APEYEIRPYT 
FLNLWLKLGR 
QKAEPLERLQ 
PIIRVGOALK 
PRAKENELQA 
SINHIL8TRK 
RKKLKEQREA 
QIASESQKIP 
mCSPPADPPR 
ICPHPEDTCD 
SQTLSIXKLS 
PWCSQYRRQL 
FHVAPRNSAE 
CFTLQDLLQH 
NKKNQALKIV 
HLLIiFKEHLQ 
GVFDTTPQSH 



11 

1 

LSEAMSLG6D 
GNDPLDVWDR 

lcnepldmys 
sqhrqfqarv 
apsqnrglqn 
gphntgrsle 
p6keegdplq 
elltsaekra 

GMTXiSSSVOQ 
VLAQRRPIiAV 
FARAARFVST 
PIZBDSREAT 
LKSLPELSAS 
LTVIKVSSQP 
SEYITHEITV 
DFSYSVDIiRV 
VFHD6SFWKL 
LNKALWKVGK 



21 

1 



YISWTEQNYP 
YLHNQGIGVS 
SRQTLLALEK 
PFPQQMQNNS 
HRPRGNTASL 
RVQSHQQASB 
ratQKQIEEME 
VNCCARETSI* 
LXTSBSITSH 
PFHEIMSLKD 



31 

1 

PLRQGRIMST 
QGGKESMMST 
LAQPYISWAE 



AELCIEDRPM 
VPWDFYINLK 
LIIYNLLTIV 
QLDVFTLS6F 
SQNISELKDG 
LTSPGALLFQ 



RITVFDBNAD 
ZAVPAVIiPSF 
EKKEXMHYCK 
KKLKEIQTTQ 
AEtTIWQEQPH 
EDVSPDVCDE 
LPSDPERIiIiP 
SVASTSSIKC 
PKLEIEKEIE 
IiKERLNEDPD 
EMiSKABIVH 
RTVQILEGQK 
ELWNKPFVRI 



GGTATATCAG 
CGTTATTAGA 
GATTTCTCAA 
GTTACTT6CA 
AAGAATATGA 
AACA6AAG6C 
TGTCTCGOCA 
C?rGTACCACA 
CTCCAATCAT 
ATCCATTTCC 
ATGAGGCTTC 
TGCCC31GGGC 
AACACAGGCC 
TCACTCCATA 
CTAGTATAAA 
AAAGG6TTCA 
AGGAGAA6AT 
TC066AAGAA 
CAQAAATGCA 
AGCAAGAAAG 
TGCAAATTGC 
AAGTAAACTG 
ATTCTAAAOG 
AGAATAAAAG 
TTCTCAAAAC 
AATTTACAGG 
CAATTTGTCC 
CTCCTTTTCA 
C6QAAGAA6A 
ACAGTCAGAC 
CACACTCCTC 
GTCTTCAAAT 
CACCATG(?rG 
CTGCAGAGTT 
AATTAGGTAA 
TATTCTGGGT 
CTGTCCCATG 
ATCATTTTT6 
ACT6CTTCAC 
TGTTGATTAT 
ATG6TGACTT 
GTAACAA6AA 
T6CAQCT66A 
AGATCCTG6C 
CACATTTACT 
TTAGCCAAAA 
TTCTGAATGC 
ATGGGGTTTT 
A6TTAACTAG 
6CT6CCTCAG 
TTAGGACACA 
AOGTCACTQA 
AAGAACTATT 
TGTCTCTACT 
ACCATGTATT 
AAA 



41 

I 

LQGALAQESA 
LIiERAVEALQ 
EYEARENFRK 
VPQRSTLAEL 
EA5TAELSKP 
TPyVEETAQQ 
ERIYA6VGEP 
QERTGDQQEE 
SKGPSVPFSI 
FT6IEPLSED 
EEDXiDVRTSE 
LQIPBKLELT 
LGNEDYCIKR 
HFCSCYQYQD 
GDLSPRCLIL 
lUVNCSSPYQ 
INANDBATVS 



CTGOACAGAG 
AAGAGCTGTA 
TCTCTGOCTT 
CAAGCAAGGG 
AGCTAGAGAA 
TGAACCACTA 
AACTCTGTTG 
AGGAA6CACA 
CCGTGTAGGA 
TCAACAGATG 
TACAGCAGAG 
CAAAGAGAAT 
TCGTGGCAAT 
T6TGGAAGAG 
CCACATCCTA 
GAGCCATCAG 
TTATSCAGGA 
ATTAAAAGAG 
GAAACAGATT 
AACAGGTGAT 
TTCCGAGTCT 
TTGTGCCAGA 
TCCCRGTGTA 
TCCTCCTGCA 
CTCAGAAAGC 
AATTGAACCC 
TAACCCA6AA 
TGAGATAATG 
TCTAGATGTA 
TCTCAGCATC 
TGGCTTCTCT 
TCCTGAGAAA 
TTCACAGTAT 
GTGTATAGAA 
TGAGGATTAC 
GGOGCCAAGA 
GGACTTTTAT 
CAGCTGTTAT 
CCTTCAGGAT 
TTATAAGCTT 
GAGTCCAAGO 
CAATCAAGCT 
TGTTTTTACC 
TAACTGTTCT 
ATT6TTCAAG 
TATTTCT6AG 
CAATGATGA6 
TGACACTACA 
TCCTGGGGCT 
A6CAATGGTT 
TTTAGATGCA 
TATTTTTTAT 
TTATTCTAAA 
TTTCCCTGTA 
TTGTAAATAA 



51 

1 

CNNTLQQQKR 
OEKRYYSDPR 
ADAIFQE6IQ 
KSKGKKTARA 
TVQPWIAPFM 
FVNTPCKZEP 
SFEEXRAEVF 
TMPTKETTKL 
FDEFLIiSEKK 
AIITGPRNVT 
DQQTACGTIY 
NETSENPTQS 
BYLICEDYKL 
GCrVWHQYIN 
RKRIEDPYDC 
VDIiFGIADLA 
VLGELAAENN 



300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
22B0 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



Seq ID NO: 262 DNA sequence
Nucleic Acid Accession #: NM_003784
Coding sequence: 365..1507



1          11         21         31         41         51
|          |          |          |          |          |
OTCTACTTAT CAATAAGC31B CTGCCTGTGC AGAGTGCAGG CTGCACCm GOACAGCCTT 60
TAAAACTGAA TTCTCAGAAT TTTAGAACAA ATTTTTGTCT AGAAATGCTG ACTTT6GTTC 120
ATTAGGTAQT GGTAAAACAG GCTCCCTTCG AAGCTCTCCT TCATCACCTT CCTAAGTGCA 180
TGTACAGGGA AGCTCTCCTT CATCACCTTC CTAAGTGCAT GGGGGAAAAT ACCTAG6GCT 240
CAACAGTCTT GAGAAGTGTG GAAACATTTT CTTTGTGAGT 6AGAACAGAT CACCTAGAGA 300
AAGGAAACCA GATTCCCATC ACTGCTTCTO QGTATCAOAT GCTAGCGCTG CACTCCATTT 360
TGCAATC3GCC TCXXTTTGCTG CAGCAAATGC AGAGTTrTGC TTCAACCTGT TCAGAGAGAT 420
GGATGACAAT CAAG3AAATG GAAATGT6TT CTTTTCCTCT CTGAGCCTCT TCGCTGCCCT 480
QGCCCTGGTC CGCTTGGOCG CTCAAGATGA CTCCCTCTCT CAGATTQATA AGTTGCTTCA 540
TGTTAACACT GCCTCAGOAT ATG6AAACTC TTCTAATAGT CAGTCAGG6C TCCaOTCTCA 600
ACTGAAAAGA GTTTTTTCTG ATATAAATGC ATCCCACAAG GATTATGATC TCAGCATTGT 660
GAATGGGCTT TTTOCTGAAA AAGTGTATGG CTTTCATAAG GACTACATTQ AQTGTGCCGA 720
AAAATTATAC GATGCCAAAG TGGAGCGAGT TGACTTTA06 AATCATTTAG AAGACACTAG 780
ACGTAATATT AATAAGTGGG TTGRAAATGA AACACATG6C AAAATCAA3A AOQTOATTGG 840
TGAAGQTGGC ATAAGCTCAT CTGCTGTAAT 6GT0CTGGTG AATGCTOTOT ACTTCAAA06 900
CAAGTGGCAA TCAGCCTTCA CCAAGAGCGA AACCATAAAT TGCCATTTCA AATCTCCCAA 960
GTGCTCTGGG AAG6CAGT0G CCATGATGCA TCAGGAAOGG AAGTTCAATT TGTCTGTTAT 1020
TGAGGACCCA TCAATGAAGA TTCTTGAGCT CA6ATACAAT G6TGGCATAA ACATGTACGT 1080
TCTGCTGCCT GAGAATGACC TCTCTGAAAT TQAAAACAAA CT6ACCTTTC AGAATCTAAT 1140
GGAATGGACC AATCCAAGGC GAATGACCTC TAAQTATOTT GAGGTATTTT TTCCTCAGTT 1200
CAAGATAGAG AAGAATTATG AAAT6AAACA ATATTTGAGA GCCCTAGGGC TGAAAGATAT 1260
CTTTGATGAA TCCAAAGCAG ATCTCTCTGG GATTGCTTCG GGGGGTCGTC TGTATATATC 1320
AAGGATGATO CACAAATCTT ACATAGAGGT CACTGAGGA6 GGCAC06AG6 CTACTGCTGC 1380
CACAGGAAGT AATATTQTAO AAAAGGAACT CCCTCACTCC AOGCTQTTTA QAOCIQACCA 1440
CCCATTCCTA TTTQTTATCA GOAAOGATGA CATCATCTTA TTCA6TGGCA AAGTTTCTT6 1500
CCCTTGAAAA TCC3VATTGGT TTCTGTTATA GCAGTCCCCA CAACATCAAA QRACCACCAC 1560
AAGTCAATAO ATYTGRGTTT AATTGGAAAA AT6TGGTGTT TCCTTTGAGT TTATTTCTTC 1620
CTAACATTGG TCAGCA6ATG ACACTGGT6A CTTGACCCTT CCTAGACACC TGGTTGATTG 1680
TCCTGATCGC TGCTCTTAGC ATTCTACCAC CATGTGTCTC AOCCMTTCT AATTTCATTG 1740
TCTTTCTTCC CACGCTCATT TCTATCATTC TCCCCCATGA GCX3GTCTG6A AATTATGGAG 1800
RGTGCTCAAC TGGTAAGGAG AAOSTAGAAG TAGCCCTAGG GATCCTTTTT GAAACTCTAC 1860
AGTTATCGCA GATATTCTAG CTTCATTGTA AGCAATCTAQ GAAATAAGCC CTGCTGCTTT 1920
CTAGAAATAA GTGTGAAGGA TAAATTTTCT rPGTTGACCT ATGAAGATTT TAGAGTTTAC 1980
CTTCATAiar TTGATTTTAA ATCA0TGTAT AATCTAGATQ 6TAAAAAAT6 TGAAATTGGQ 2040
ATTAGGGACC TACX31AAATA TTTCATTAAT GCTTTCAATT GACAAATTTT GGCCTTTCTT 2100
TGATAAGACA ATATGTACAT GTTTTTTCAA ATATTAAAGA TCTTTTAACT GTTG6CAGTT 2160
GTTATCTACA GAATCATATT TCATATGCTG TGTAGTTTAT AAGTTTTTCC TCTATTTATC 2220
AGAATAAAGA AATACAACAT ACCTGTAAA

Seq ID NO: 263 Protein sequence:
Protein Accession #: NP_003775



1 11 21 31 41 51

I I I I I i 

MASLAAAKAS FCFNLFREMD DNQGKGNVFF 5SLSLFAALA LVRLGAQDDS LSQIDKLLBV 60 

NTASGYGNSS NSQSGLQSQL KRVFSDINAS HKDYDLSIVN GLFAEKVYGP HKDYIBCABK 120 

LYX3AXVERVD FTMHLEDTRR NINKWVENET HGKIKNVIGB GGISSSAVMV LVNAVYFKGK 180 

WQSAPTKSET IHCKFKSPKC 8GKAVAMHHQ BRKPHLSVIB DPSMKILELR YNGGINMYVL 240 

liPENDLSBIE NKLTFOtlLME WTNPSRMTSK YVEVFFPQFK lEKNYEMKQY LRALGLKDIF 300 

DBSKADLSQI ASGGRLYISR MHHKSYIEVT EEXSTEATAAT GSKIVEKQLP QSTLFRADHF 360 
FLFVIRXDDI ILFSGKVSCP 



Seq ID NO: 264 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74-814

1 11 21 31 41 51 

I I I I I 1 

AAAAOCTTGA GGTGATTCAT CTTCCAGGCT CTCCITCCAT CAA6TCTCTC CTCCCTAGCG ' 60 

CTCTGGGTCC TTAATGGCAG C3VGCCX3CCGC TACCAAGATC CTTCTGTGCC TCOXfCrTCT 120 

GCTCCTGCTG TCCGGCTGGT OCCGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGR 180 

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCAOGGTGG TGT6C3GGTTC AAGGCCAGGT 240 

GGATQAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTC3VCAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAQAACCCAG TACTGAGAGA 360 

GGTGGTGGAC ATACTTACAG AGCAACT6C6 TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GQAACCCCTC ACX:CTGCAGG CCAGGAT6TC TTOTQAGCAO AAAGCTGAAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT T0GATGG6CA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540 

AATGTGGACA AOGGTTCATC CTGGAGCCAG AAAGATQAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCXSUMSTGCA GGA GCACC AC TOQCCATGTC 720 

CTCAG6CACA A0CC3UVCTCA GGOCCACASC CACCACCCTC ATCCTTTGCr COCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TC3CCTGGCAT CTGAGGAGAG TCCTTTAGAO TGACAQGTTA 640 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCA6ATCAT GATGACATCA 960 

TGOACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCX: AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTrCTCTTGO TOCTACCIOA TGGAATTCCT GCACTTAAAO 1080 

TTCTGGCT6A CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT G6AAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTO CAAATQATAT TGTCAGTAAA ATAATCAOOT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCCG TGTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTQTTT CCTTTAGTAA TTTATTGTTC TGTACT6ATA 1320 
TTTAAATAAA GAGTTCTArT TOCCAAAAAA AAAAAAAAAA A 

Seq ID NO: 265 Protein sequence:
Protein Accession #: BAB61046.1




1 11 21 31 41 51 

I I i I I i 

HAAAAATKIL IiCLPLLLLLS 6WSRA6RADP HSLCYDITVI PKFRPGPRWC AVQ6QVDSKT 60 
FLHYDOGNKT VTPVSPLGKK LNVTTAWKAQ NPVLRBWDI LTEQLRDIQL ENYTPKEPLT 120
LQARMSCEQK ABGHSSGSWQ PSFDGQIPLL FDSEKRMWTT VHPGARKMKE KWENDKWAM 180

SPHYF6MGDC IGWLEDFIjMG MDSTLEPSAG AFLAMSSGTT QLRATATTLI IiCCIiLIILPC 240 

FZLPQI 

Seq ID NO: 266 DNA sequence
Nucleic Acid Accession #: XM_084853.1
Coding sequence: 127-444

1 11 21 31 41 51

I I t 1 I I 

ATTGATGATA TATTTAACGA AATCAAATTT GGT6AATATG TGGACACTGG AAAGCTAATC 60

GACAAGATCA ACTTACCAGA TTTCCTAAAA GTGTACCTTA ACCACAAGCC AOCTTTTGGT 120 

AACACCAT6A GTGGCATCCA CAAGAGCTTT GAGGTGCTGO GTTATACXAA CTCCAAAGGG 180 

AAAAAGGCXa TTCGAAOAGA GGACTTCCTG AQACTGCTOS TTACTAAAGG TGAGCATAT6 240 

AOSGAGGAGG AGATGTTGQA TTGCTTTGCT TCACTGTTTG GCCTGAATCC CX3AGGGATGG 300 

AAATCCGAGC CTGCAACCTG CTCCGTCAAA GGTTCAGAAA TTTGCCTTGA AGAAGAACTT 360

CCA6ACGAAA TCACTGCAGA AATATTCGCG ACTGAAATTC TTGGCTTAAC CATTTCAQAA 420 

GATTCCGGCC AGGAT6GTCA QTGAAGTTAC CAGGAATGTT TAAAGCACAA AGGACTTTGG 480 

GTGTGTGTGC ATGCACATGT GTGTGTTTTC CATGAGGCAC TGCTTTTTAT GCATTTCCCT 540 
CCCCCCTCrC ATCTTTAGAA CATTTAGACA TTAAAGCAA6 TTTCTGGTQA GCAATG 






Seq ID NO: 267 Protein sequence:
Protein Accession #: XP_084853.1 

1 11 21 31 41 51 

i I I I I I 

MS6IHKSPEV LGYTNSKGKK AIRREDFLRL LVTKGEHNTE EEMLDCPASL FGLNPEGWKS 
EPATCSVXOS^ BIOiEEBLPD EITAEIFATE ZLSLTISEDS GQDGQ 

Seq ID NO: 268 DNA sequence

Nucleic Acid Accession #: NM_001898
Coding sequence: 57-482

1 11 21 31 41 51

| | | | | |

GGCTCTCACC CTCCTCTOCT OCAGCTCCAG CTTTGTGCTC TGOCTCTGAa GAGACCaiGG 60 

CCCA6TATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGGCC CTGGCCTG6A 120 

GCCCCAAGGA GGAGGATAGQ ATAATCCOSG GTGQCATCTA TAACGCAGAC CTCAATGATG 180 

AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC ACCAAAGATG 240 

ACTACTACAG ACGTCCGCTG CGGGTACTAA QAGCCAGGGA ACAGAOCGTT GGGGGGGTGA 300 

ATTACTTCTT 0GA06TAGAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CXXSU^CTTGG 360 

ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTC5GAGA 420 

TdAOG AAar TOCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 480 

AGGGATCTQT GCCAGGCCAT TCGCACCAGC CACCACXXavC TCCCACCCMC TQTAGTGCTC 540 

CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCAIGTGCC TGCGCCAAQA 600 

GACAGACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AGQGCGCTCT GCCCTCCCTC 660 

CTTCCTTCTT GCTTCTAATA GCCCTGQTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 720 
AAACAGTAGC ATC6CC 

Seq ID NO: 269 Protein sequence:
Protein Accession #: NP_001889.1

1 11 21 31 41 51

| | | | | |

MAQYLSTLLL IiLATLAVALA HSPKEEDRII PGGZmADUl DEHVQRAIiHF AISEYNKATK 60
DDYYRRPLRV LRARQQTVG6 VNYFFOVEVO RTICTKSQPtt LDTCAFHEQP ELQKKQLCSP 120 
EIYEVPWBHR RSLVKSROQE 5 



Seq ID NO: 270 DNA sequence
Nucleic Acid Accession #: XM_093210
Coding sequence: 13-1854

1 11 21 31 41 51

| | | | | |

ATGGCAAGCG CQGGAATCTC CTCAGCTGCC GTTTCACAAA AGAGGTACCA GGTCCGCACC 60

AAAOQAGCAC ACAAGCAGCA CCAGGA6CTG CAGAAGAAG6 AGGGGGCAGC GATGGACCAG 120 

66CAGAGG6A ATGGGGAGGG GGCATCCTAC CCCATATCTG AGGTGCGACT GCGGGACGTA 180 

QAGCOQACTG GGCCTTTCCC GTTGGCGOST GGCCTCAATC AGGACTTCTT GCCCACGTGC 240 

GCCTTCAAAA OGGTAAGAGC TGCAACTGAA CGTGTGAGAC ATGGTGCAGA TAGGCTGAGA 300 

GGCGGCGGGA GAGATGCCCA TGAACTCAAG TACCC6GACA CQCCCTCCAC TTCTACCACC 360

ACQAOTAACA CCQCCCCCAC GGGACCGCTC TCGAGSTCCC CCAAflCCAAG QACGCAAGGA 420 

GGAAG6CCCC GGCX3C6CG6C CAGCAGGQGC GG6CACCG6C CCAATGGCCA CGGAACTCA6 480 

CACTGGCAGT CGGCCCTCCT CACACCQCAG GCGTGCAGTG TGGCCGACGG AGCCTCCCGG 540 

GCCGAGGACC CAGCTAGGCC GTCACCCCGQ TTQCTCCCAC GGGAAGGGGC ACCAGGCAAA 600 

CTGCCCAAGG CCCOGAGCCC AGGCTCCCTG GCGGAGGCCT CCGCTGGTCC 06CCCAGATC 660

ATGGCCQCCA CCAGGCTCCC OAGCCATOGC TTOCTGTCCG G6AAOGQC0C GGOSTCCTCG 720 
CTGTCCAOCT AG 



Seq ID NO: 271 Protein sequence: 
Protein Accession #: XP_093210



1 11 21 31 41 51

| | | | | |

MLRHGEQKRK HARKKWDFLP TCAFKTVRAA TERVHHQADR LRGGOHDAHE LKYPDTPSTS 60
TTTSNTAFTG PLSRSPKPRT QGGTPRRRPA AA6TRANGRG TQHWQSALLT PQACSVADGA 120
SRAEDPARPS PRLIiPREGAP GXLPKAPSFG SLAEASA6LL AHVRLQNADA QRVSISQALP 180
nSSSVGRKEB RPaAOQQRRA PAFHATSLST 6SRPSSHRKR AVHPTEPFOP RTQLBPSPRL 240
LPRBOAPGRIi PKAPSP6SIA EASAOPAQIM AATRLPSRGF LSGIKSPASHL 8S

Seq ID NO: 272 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..732

1 11 21 31 41 51

| | | | | |

GGATACTGTG TCACTCAAAG TAATGGGAGG GAGAGAGAAC AGGGAGGGTA GGGATGCTTT 60
TQAAAAAGCT TTTTTTCCCA CTTTTAACTT GCTTTAOGCT TAAOAGTACT TACCA6CTAA 120
TAATGTGGAG GAAATTATTC TTTCTCATTG GAGATTACAG AATATATCTA TTGATCTTGA 180
ATACCXACTT GAAGCCTCTG TAGAAATGTC TC6TCCTCCG GTTGTATTTC TAAAACCTAC 240
ATGATTTTGT CTTGTTTCTG CAGTGAGAAA TTACATCCAT AGCAAAGACA AAAGTCTTTT 300
TAAATTATTT TTATTTATCT TTCATATAGT TCTTACAATT TCTAAAAAAT TAACACTCAT 360
TTAGTATCAC AATTTATGGQ AGAGGGTTTT TTGTATTTTT AAGCATATGT GGCTTAXATA 420
AAAATT6CAG AAGTCATAGG ACTOTCATGT ATT6CAGCTC TGAGAACCAA TGGCTGAAAC 480
TTAA6CC

Seq ID NO: 273 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31

| | | |

MGGRENREGR DAFEKAPPPT PNLL

Seq ID NO: 274 DNA sequence
Nucleic Acid Accession #: NM_003976.2
Coding sequence: 299-961

1 11 21 31 41 51

| | | | | |

CTCTGAGCTT CTCTGAGCCT TGTTTGCTCA TCTGGAAAAA GGG6ATTAAA CCATTTACCT 60
CATGGAOTTG TGAAAGAATA OCTGCAAAGC ACCTAACACA TAGTAAGGTT CCCA6TGCAG 120
CTACTTCTGC TGGGTTGAGT CTAGCTQTGT AGGCCCCTTG TTCCTCACCT GGAGAAACTG 180
GG6TGGCAGG COGGTCCCCC ACAAAA6ATA ACTCATCTCT TAATTTGCAA GCTGCCTCAA 240
CAGGAGGGTG GGG6AACA6C TCAACAATGG CTGATGGGOG CTCCTGGTGT TGATAGAGAT 300
GQAACTTGGA CTTGGA6GCC TCTCCACGCT GTCCCACTGC CCCTGGCCTA GGCGGCAGCC 360
TGCCCTGTGG CCCACCCTGO CCGCTCTGGC TCTGCTGAGC AGC6TCGCA6 AGGCCTCCCT 420
GGGCTCCGCG CCCCGCAGCC CTGCCCCCCG CGAAGGCCCC COGCCTGTCC TGGCGTCCCC 480
C6CG60CCAC CTGCCGGGGG GACGCACGGC CCGCTGGTGC AGTGGAAGAG CCCGGCGQCC 540
GCGGCOGCAG CCTTCTCGGC CCGCX3CCCCC GCOGCCTGCA CCCCCATCTG CTCTTCCCCQ 600
GGGG66CC6C G0GGCGCGG6 CTGGGGGGCC GGGCAGCCGC 6CT0GGGCA0 GGGGGGGQOQ 660
GGGCT6CCGC CTGCGCTCGC AGCTGGTGCC GGT6CGCG06 CTO6GCCTG0 GCCAC06CTC 720
CGAC6AGCTG GTGOGTTTCC GCTTCTGCAG CGGCrCCTGC CGCOGCGOGC GCTCTCCACA 780
OGACCTCAGC CTGGCCAGCC TACTGGGOGC CGGGGCGCT6 CGACCGCOCC CGGGCTCCOG 840
GCCCGTCAGC CAGCCCTGCT 6CGGACCCAC GCGCTACGAA G0G6TCTCCT TCATGGACGT 900
CAACAGCACC TGQAGAACCO TGQAC06CCT CTCCGGCAQC GCCTGOGGCT GCCTGGGCZG 960
A6G6CTCGCT CCAGiGGCTTT GCAGACTG6A CCCTTACCG6 TGGCrCTTCC TCCCTG6GAC 1020
CCTCCOGCAG AGTCCCACTA GCCAGC6GCC TCAGCCAGGG ACGAAGGCCT CAAAGCTGAG 1080
AGQCCCCTAC CGGTGGGTQA TGGATATCAT CCCC3GAACAG GTGAAGGGAC AACTGACTAG 1140
CAGCCCCAGA GCCCrCACCC TGCGGATCCC A6CCTAAAAS ACACCAGAGA CCTCAGCTAT 1200
GGAGCCCTTC GGACCCACTT CTCACAGACT CTGGCACTGG CCAGGCCTOO AACCTGGGAC 1260
CCCTCCTCTG AT6AACACTA CAGTGGCTOA 66CATCAGCC C00GCCCAQ6 CCCT0TAG6G 1320
ACAGCATTTC AAG6ACACAT ATTGCAGTTG CTTG6TT6AA AGTGCCTGTG CTG6AACTGG 1380
CCTGTACTCA CTCATG0GA6 CTGGCCCC

Seq ID NO: 275 Protein sequence:
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSHCPWFRRQ PALHPTIiAAIi AltLSSVAEAS LGSAPRSPAP RBGPPPVLA8 60
PA8HLPG6RT ARHC8GRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG F6SRARAAGA 120
RGCRLRSQLV FVRAIiGXiGHR SDBLVRFRFC SG8CRRARSP BDLStASLLG AGALRPPPGS 180
RPVSQPCCRP TRYEAVSPMD VNSTWRTVDR LSATACQCLG

Seq ID NO: 276 DNA sequence
Nucleic Acid Accession #: NM_057091.1
Coding sequence: 763-1445

1 11 21 31 41 51

| | | | | |

ACTGGCC6CT GABAOAAGAA TGOGGTOGAO CAGAGAGCAG CTGCTGCAGG GCAGACAGCC 60
GGACCCCCAA ATCTGCACGT ACCAGCAGTC AGCCGCCCCA OGCAGGGACC GGCTTACCCC 120
TCGCTCCCCG CCCTCACTCA CTTTCTCCCG CCCTCQGCCC GGCCTCCCAG CTCTCTACTT 180
CG06TGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCrCTC CACCGCTCGA GTTCTCTACT 240
CTCCATATCC GAGGGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTCG6 GGGACCTAGC 300
CAAGCTAGGG GGGACTGGAT CGGACGGGTG GAGCAGCCAG GTGA6CCCCG AAAGGTGGGG 360
CGGGGCAGGG GOGCTCCCAG CCCCACCCCG GGATCTGGTO AG6CTGGGGC TGQAATTTQA 420
CACCGGACGG CTG06GCGGC GGGCAGGAGG CTGCT6AGGG ATGGAGTTGO GCCOGGCCCC 480
CAGACAAGGC COGGGGGCTC C6CCAGCAGC A6GTCCCT0G GGCCCCAGCC CTCGCTGCCA 540
CCCG6GCCTG OAQCCCCACA CCCGAGGGTG CAGACTGQCT GCCAAGQCCA CACTTTTGGC 600
TAAAAGAOGC ACT6CCA6GT GTACAGTCCT GG6CATGCJQC TQTTTGAOCT T0GGGG6AGA 660
GCCCAGCACT GGTCCCCGGA AAGGTGCCTA QAAGAACAA6 GTGCAGGACC CCGT6CTGCC 720
TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATO GGCXSCTCCTG GTGTTGATAG 780
AGATGGAACT TGGACTTGGA GGCCTCTCCA CGCTGTCCCA CTGCCCCTGG CCTAGGCX3GC 840
AGCCTGCCCT GTGGCCCACC CTG6CCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT 900
0CCT6GGCTC CGOSCCCOQC AQCCCTQCCC CGGGCXaAAGG CCCCCG8CCT GTCCTGGGGFT 960
CCCCCGCCGG CCACCTGCCG GGGGGACGCA CGGCCCGCTG 6T6CAGTG6A AGAGCC0G6C 1020
GGCCGCCGCC GCAGCCTTCT CGGCCCGGGC CCCOGCCGCC TGCACCCCCA TCTGCTCTTC 1080
CCCX3CGGGGG CCGOGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCX3G GCAGCGGGGG 1140
(SXSGGGGCTG CGGCCTOOGC TCGCAGCTGG TGCCGGTG06 OGOGCTCGGC CTGQGCC3«:C 1200
GCTCOGACSQA GCTG6TG0GT TTCOGCTTCT 6CAGCXK3CTC CTGC06C0GC GaSOGCTCTC 1260
CACACGACCT CAGCCTGQCC AGCCTACTGO 6CQCCGGQQC CCTQOSACGG CCCCOGQGCT 1320
CCCGGCOCGT CAGCCRGCCC TGCTGCCXSAC CCAOGOGCTA C3QAAGCGGTC TCCTTCATGG 1380
ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCC6C CACCGCCTGC GGCTGCCTGG 1440
GCTGAGGGCT CGCTCCAGGG CTTTGCAQAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500
GGACXXrrCCC GCAGAGTCCC ACTAGCCA6C GQCCTCAGGC AGGGACGAAG GCCTCAAAGC 1560
TGAGAGGCCC CTAC06GTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA 1620
CTAGCAGCCC CAGAGCCCTC ACCCTGCGQA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680
CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTO 1740
G6ACCCCTCC TCTGATGAAC ACTACAGTG6 CTGAGGCATC AGCCCCOGCC CAG6CCCTGT 1800
AGGGACAGCA TTTGAAGOAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA 1860
CT6GCCTSTA CTCACTCATG GGAGCTGGCC OC

Seq ID NO: 277 Protein sequence:
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSKCPHPRRQ PALWPTIiAAL ALLSSVABAS IjGSAPRSPAP REGPPPVLAS 60
FAimLPGQRT ARHCSGBARR PPPQPSRPAP PPPAPPSALP RGC^IAARAGG PGSRARAAGA 120
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSGRSARSP HDLSZiASLLG AGALRPPPGS 180
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATAOKIiG

Seq ID NO: 278 DNA sequence 

Nucleic Acid Accession #: NM_057160.1

Coding sequence: 1-714 



1 
I 

ATGCmSGCC 
CAGCTGGGTG 
TGGCXrCACCC 
GCQCCCCXSCA 
CACCTGCCGG 
CAGCCTTCTC 
CGCGCGGC6C 
CGCCTGCQCT 
CT6GT6GGTT 
AGCCTGGCCA 
AGCCAGCCCT 
ACCTGGAGAA 
6CTCCAGGGC 
CAQAGTCCCA 
TACCX3GTGG6 
AGAGCCCTCA 
TTCGGACCCA 
CTGATGAACA 
rUGAAOSACA 
TCACTCAT66 



11 
I 

TGATCTCAGC 
CCCTCTTTCT 

TGGCOGCTCT 
GCCCTGCCCC 
GGGGACGCAC 
GGCCCGC6CC 
G66CTGGGG3 
06CAGCTGGT 
TCOGCTTCTG 
GCCTACTGGG 
GCTGCCGACC 
COGTGGACCG 
TTTGCA6ACT 
CTAGOCAGOQ 
TGATGGATAT 
CCCTGCGGAT 
CTTCTCACAG 
CTACAGTGGC 
CATATTOCAG 
6AGCTG6CCC 



21 
I 

CCX3AGGACAG 
CCCTGAGGCT 
GGCTCTGCTG 
CCGC6AAGGC 
GGCCGGCT6G 
CCCXSCOGCCT 
CCOGGGCAGC 
GCCGGTGG6C 
CAGCGGCTCC 
GGCCGGGGCC 
CACGCGCTAC 
CCTCTCCGCC 
6GACCCTTAC 
GCCTCAQCCA 
CATCCCCGAA 
CCCAGCCTAA 
ACTCTX3GCAC 
TGAGGCATCA 
TTGCTT6GTT 
C 



T6CCGCCGCG 
CTGCGACCGC 
GAAGCGGTCT 
ACO6CCTG0Q 
OQGTGGCTCT 
GGGAOQAAGG 
CAG6T6AAGG 
AAGACACCAG 
TG6CCAGGCC 

GAAAGIGCCT 



CCTCAAAGCT 
GACAACTGAC 
AGACCTCAGC 
TOGAACCTGG
AGGCCCTGTA
GTGCIGGAAC



Seq ID NO: 279 Protein sequence:
Protein Accession #: NP_476501.1

1 11 21 31 41 51

| | | | | |

MPGLISARGQ PLLSVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALAIJL SSVAEASLGS 60
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSAItPRGG 120
RAARAGGPG8 RARAAGARGC RLRSQI*VFVR ALGUSHRSDB IiVRFRFCSGS CRRARSPHDL 180
SIiASIiL^AGA LRPPF6SRPV 5QPCCRPTRY BAVSFMDVNS THRTVDRLSA TACGCLG

Seq ID NO: 280 DNA sequence 

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29-715 



CTGATGGGCG 
GTCCCACTGC 
GTGGCCCACC 
CGCGCCCC6C 
CCACCTGCCG 
GCAGCCTTCT 
CCGCGCGGOG 
CCGCCTGCGC 
GCTGGTGOGT 
CAGCCXGGCC 
CAGCCAGCCC 



11 

I 

CTCCTGGTGT 
CCCTQGCCTA 
CTGGCCGCTC 
AGCCCTQCCC 
GGGGGACGCA 
CGQCCGGCGC 
CGGGCTGGGG 
TCGCAGCTGG 
TTCOGCTTCT 
AQCCTACTGG 
TGCT60CGAC 



21 
I 

T6ATAGA6AT 
GGCGGCAGGC 
TGGCTCTGCT 
CCCGCGAAG6 



CCCCGCCGCC 
GCCCGGGCAG 
TGCOGGTGCG 
GCAGC6GCTC 

CCAOGOGCTA 



31 
I 

GGAACTTGGA 
TCCACTTGGT 
GAGCAGCGTC 
CCCCCCGCCT 
6T6CAGTGGA 
TGCACCCCCA 
COGCGCTOGG 
CGCGCTCGGC 
CTGCC6C0GC 
GCTGGGACGG 
COAAGCGGTC 



41 
I 

CTTGGAGGCC 
CTCTCCGCGC 
GCAGAGGCCT 
GTCCTGGCGT 
AGAGCCCOGC 
TCTGCTCTTC 
GCAGCGGGGG 
CTQGGCCACC 
GOGCGCTCTC 
0GC0C3QGGCT 
TCCTTCATGG 



51 

I 

TCTCCACGCT 
AGCCTGCCCT 
CCCTGGGCTC 
CCCCCGCQGG 
GGCCGOCGGC 
CCC6GGGGG6 
CGCGGGGCTG 
GCTCCGACGA 
CACACGACCT 
0COQ6CCCGT 
ACGTCAACAG 




CAGCTGOAGA AC06TGGACC GOCTCTCOOC CACOGCCTGC GGCTGCCTGG GCTGAG66CT 720 

CXSCRRCCAGGG CTTTGCAGAC TGQACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCXRRCCC 780

GCAGAQTCCC ACTAGCCAGC GGCCTCAGCC AGGGA0C3AAG GCCTCAAAGC TGAGAGGCCC 840 

CTAC0Q6TGG GTGATGGATA TCATCCCOGA ACAGGtGAAG GGACAACTGA CTAGCA6CCC 900 

CAGAGCCCTC ACCCT60GGA TOCCAGCCTA AAAGAGACCA GAGACCTCAG CTATGGAGCC 960 

CTTC6GACCC ACTTCTCACA GACTCTQGCA CTGGCCAQOC CTCQAACCTO GOACCCXnCC 1020 

TCTGATGAAC ACTACAGTGG CT6AGGCATC AGCC0C06CC CA06CCCTGT AGQGACAOCA 1080

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAA{3TGCC TOTQCTGGAA CTOQCCIGTA 1140 
CTCACTCATG G6AGCTGGCC CC 

Seq ID NO: 281 Protein sequence:
Protein Accession #: NP_476431.1

1 11 21 31 41 51 

I I I I I I 

MELSLGGliST IiSBCFHFRRQ APL6LSAQPA XjMPTLAALAI* LSSVAEASUS SAPRSPAPRE 60 

GPPPVLA5PA GHLPG6RTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120 

SRARAAGARG CRLRSQtVPV RALGtiGHRSD ELVRPRPC8G BCRRARSPHD IiSLASLLGAG 180 
ALRPPPGSRP VSQPCCRPTR YEAVSF14DVN 8TWRTVDRLS ATACX3CLG 

Seq ID NO: 282 DNA sequence

Nucleic Acid Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

CTACTGCACC TGCCCTCTGT TTCCTTTGGA AATCTCTTAC CTTTCATTAG GGTTTCTTTC 60 

ATAGCAATTT CCTTTQQTTT TTAAGACTTC TACATTGCTT TTTCTTTTAT TATCTGTGCT 120 

CaOTGAACCr TATGAATGCT GCTTAAAAAT AATGTCAAAA TATCmTAG CTGCCTACTC 180

AQOTAACGTT TTCTTTTQCT CTCATCTTGG TTTCCATATA CTATTTTTGO TTTOTTGTGA 240 

GATCTAATCA ATQATCTAGT CA6AAGCTAC TTCACTGGCT AACftGTGATC ATGTTCATGT 300 

GCTAAAAATG AACTTGAAAC ACXK3AAGTAG TGGTTGGTCC AGTTTGAAAG CTCTTATTAG 360

TATTCTTCAT CCTGGCTGTA ATAATAGCCA TTATTTGTTA TGCCTTTGTT ATQTAGCAGA 420

CACTCTTAAG GATTTTATGT GTATTATTCA AATTQCTATT ACTOTTCTTT TTATAGTTGA 480 
GAATCTCAGG ATAOCTACAT TTATCACTTT TTCAATATAT ATOTATTTCT TATT 

Seq ID NO: 283 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 564-1481

1 11 21 31 41 51

| | | | | |

GAGACTTTTA ATCATCTATC CCTTGTGCTT TAOGCAGACC CTACAATACA CTAGAGGCTT 60 

CAAAGAGGTC AAAAATTCAC AT6TGTAGAC AAATTAGGTC CCTTAAOATG GCAGGCAAAC 120 

6AAGTGCTAC CAAAACACX^C AAT6ACTGTC CTAAAAGTGC GTTCTGG6AT ACACCTGTAA 180 

ACTTGGATCA AGTTCCCTCC CCTCTCCTCA AAATATATCG ACTTGTGCTG AAAGAAATCA 240

OGACOGATGC TCACAATTCT GACCTCGTAA TTATATAQGG GQTGGTTTTG GTTTCTGCOT 300 

CTTTCCCTGA TTCAGTGGCA GGTAACATAT TTCA7GTACA AAATGAACTG CAACACCACG 360 

GGAAACAAGG GACAOGCCCT CAAAGTTGTC GGTAOQGAaC CAGGACOOGG CCAGTGGOQT 420 

GGGGAGACAC 06TACTAAAC AAGCTTGC3UI ACAGCAGGCA CCTT0CT6CC ACTGAGGAG6 480 

AAGG6CTGQC TAAGGGAGGC C3GGGGOG6AQ GAAGCCAAGC TCTGCAQGCC CTGACAAAGT 540 

CCTCCCGGCC TCCACOCGTC QCCATQGCAA CGCGGGGTCT GTGCTGGCCG GGATTGGCOG 600

GCCTCaCGCG 06CAGGGCCC GCTG6GAAAG OGCGTCCCOG COSOGGCTCC GCCaGTTTGA 660 

ACTT6G0G66 CCAGATOTGG GC3GG0GGGGC GCTGGGGGCC TACTTTTOCC TCTTCCTAOG 720 

CQGGTTTCTC TCCTGACTGC AGACCCAfiGT CTOGGCCCTC CTOGGACTCC TGCTCAGTCC 780 

CTATGACGGG CX3CACGTGGG CAGGGGCTGG AGGTGGTGOO CTCGCOGTOS CCGCCGCTGC 840 

OQCTGAGCTG CAGCAATTCC ACCAGGTG6C TGTTGTCTCC CCTTGGCCAC CAGAGCTTCC 900 

AGTTTGACX3A G6A0GACGGT GAOGGGGAGG ATGAG6AAGA CGTGGAtGAT GAGGAAGA06 960 

TGGATGAAGA TGCOCATOAT TCAOAGGCCA AA6TGQ06AG CCTGAGAaGA ATGGAGTTAC 1020 

AGGGGTQCXSC CAOCACTCAG GTTGAATCAG AAAATAACCA AGAAGAACAG AAACAGGTGC 1080 

GCTTACCAGA AAGCCGCCTG ACACCATGGG AGGTGTGGTT TATTGGCAAA GAAAAAGAAO 1140 

AACXSTGACCG GCTGCAACTG AAAGCTCTAG AGGAATTAAA TCAACAACTA GAAAAAAGAA 1200 

AA6AAATGGA AGAAGC3TGAA AAAAGAAAGA TAATTGCTGA AGAAAAGCAC AAGGAATGGG 1260 

TTCAGAAAAA GAAOGAGCAA AAAAGAAAAG AAAGA6AACA AAAAATTAAT AAA6AAATG0 1320 

AGGAAAAAGC A6C3UUIGGAA CTGGA6AAAG AATACTTGCA AGAAAAAGCA AAAGAAAAAT 1380 

ATCAAGAATG GTTAAAQAAA AAAAATGCTG AAGAATGTGA GAG6AAGAAG AAAQAAAAGA 1440 

AAAACAACAG CAAGCTGAAA TACA6GA6AA AAAGGAAATA GCAQAAAAAA AGTTrCAAGA 1500 

ATGGTTGGAA AAT60GAAAC ATAAACCTOG TCCAGCTGCA AAGAGCTATG GTTATGCCAA 1560 

TGGAAAACTT ACAGGTTTTT ACAGTGGAAA TTCCTATCCA GAACCAGCCT TTTATAATCC 1620 

AATTOCOTGG AAACCAATTC ATATGCCACC TCCCAAAQAA GCTAAGGATC TATCAGGAAG 1680 

GAAGAGTAAA AGACCT6TGA TAAGTCAOCC ACACAAGTCA TCATCTCTGG TAATTCATAA 1740 

AGCCAGGAGC AATCTTTGCC TTGGAACTCT GTGCAGAATA CAAAGATAGC GTATGTGGAA 1800 

AATAACATGC TTTTATCTG6 AGCTATTTAA TTTAAAAATC AQAAATTGTT TTTTACTGCT 1860

CAGTGAATAA CTCAACACTT AATGTGATTA TTGACAAATA GCAATTTrTG CATTTGTATA 1920 

TGOAGTCCTT AOAGrTGAGG AA6ATATTTT CTGGATTTTG GTTTTTATAA ACTTTTTAAG 1980 

GTTGATCTT6 GCATGTTGTT TTGCAGAATA AGTGGCTGAA TATGTAAGAA TTGTGrTTGT 2040 
ATTTA6CTTG TATTAAAAOT ACACTGTAAT AOCAATAAAA CTAACAAITT TTCTT6 

Seq ID NO: 284 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MATR6LCWPG LAGLARAGPA GKARPRROSA SLNLAGQMWA AGRWGPTFPS SYAGFSADCR 60 

PRSRPSSDSC SVPMTGARGQ GLEWRSPSP PLPLSCSMST RSLLSPLGHQ SFQFDEDDGD 120 

GBDEEDVDDE EDVDEDAKDS EAKVASLRGM ELQGCASTQV B6ENNQEEQK QVRLPESRLT 180 

PWSVWPIGXB KSESDRLQUC ALEELNQQLB KRKEHBBRBK RKIIAEEKHK BHVQKXNEQK 240 

RKEREQKINK BHEEKAAKEL EKEYLQEKAK BKYQEHIiKXK NABECERKKK EKKHNSKLKY 300 




RRKRX 



Seq ID NO: 285 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1-1746

1 11 21 31 41 51

| | | | | |

ATGCCACTGA AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60 

GCCTACCATG GCTGCCCTAG CGAGTGTACC T6CTCCAGGG CCTCCCAGGT GGAGTGCACC 120 

G0G6CAC6CA TTGTGGCOGT GOCCACCCCT CTGCCCTGGA ACGCCATQAG CCTGCAGATC 180 

CTCAACAOGC ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCTGAGOA TTGAGAAGAA TGAOCTGTCG CGCATCACGC CTGGGGCCTT CCGAAACCTG 300 

GGCTCGCTGC QCTATCTCAO CCTCX3CCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360 

TTCXaWSGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA 6TAACCAGCT GTTGCAGATC 420 

CAGCCGGCOC ACTTCTCCCA GTGCAGCAAC CTCAAGGA6C TGCA6TTGCA CX3GCAACCAC 480 

CTGGAATACA TCCCTGACGG AGCCTTCGAC CAOCTGGTAG GACTGAOQAA GCTCAATCTG 540 

GGCAAGAATA GCCTCACCCA CATCTCACCC AOGGTCTTCC AGCACCTGGG CAATCTCCAG 600 

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACT6GC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTO 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAO CTCAACCGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGAGC TCTCTCTGGG GATCTTOQGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACG ACATCTCTTC TCTACCOGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAO 960 

GTCCTGATTC TTAOCCOCAA TCAOATCAaC TTCATCTCCC CG6GTGCCTT CAAGGGGCIA 1020 

ACGOAQCTTC GGGAGCTGTC CCTCCACACC AACGOVCTGC AGGACCTGGA COGOAATGTC 1080 

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CAGACAGCTC 1140 

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200 

CTGOAGAACT T6CCCCTCG6 CATCTTCGAT CACCTGGGGA AACTGTGTQA 6CTGCGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAOAC ATCCTTCGGC TCGGCAACTG GCTCX^IGCTC 1320 

AACX:A6CCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAAT6TCCGA 1380

GGCCAQTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440 

GTGCCTAGTT ACCCAGAAAC ACCATQGTAC CCAGACACAC CCAQTTACCC TGACACCACA 1500 

TCCX3TCTCTT CTACCACTGA GCTAACCAGC CXn'GTGGAAG ACTACACTGA TCTGACTACC 1560 

ATTCAGGTCA CTGATGACCG CAfiCGTTTGO GOCATGAGCC AG6C0CAGAG GGGGCTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCX3CC CTGGCCTGCT CCCTG6CTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT QTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAOGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCX^T GGAGCTTTCC 0GTGATT6CT CTTTCTGGCC 1860 

CXASATAAAO GTOTOCCTAC CTGTTGCTOA CTTGOCTGAT TCTCCGGTAO AOAAOCAGGT 1920 

C6TGC0GGAC CTTCCTACAA TCA66AAGAT AGATCCAACT GGCCATG8CA AAAGCOCTGG 1980 

GGATTTCOOA TTCATACXXX: TGGGCTTCXrr TCGAGAGGGC TCTTCCTCCA AATCCTCXCC 2040 

ACCTGTCCTC CAAGAACAGC CTTCCCTGCG CCCAQGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCT6CTC ACTTCGTGGG AATAGTTCTC C6CTGAGATA GCCCCTCTCG 2160 

CCTAAGTATT ATOTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATOOCTTG 2220 

ACCCAGCATO TCCCCTCAAA TGAAAGTTCT CCCCTTQATT TTCTGCTCCT GAA66CAGG6 2280 

TGAGTTCTCT CCTC3UUIGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCTGGTT TTGGGGATGC TATGAAAGAG AQAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400 

AGACAGAAGA GCC3GTCATCA GTQTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCaCA GCAAGCTCA6 CETTTTAGaWS AAGGATATTT CCAAACTGCA AACTTTGCTT 2520 

TGAAAAOTTT AGCCX7PTTAA QGAATOAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATA06 GGATA6AGAA A6AAATCTGG TGCCTGGGGG TCCCTGTGTT 2640 

CAOXrCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTOTAC STGCAGAAAA 2700 

GTGOGAACAT GATAGTOTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAfiCATCTAG ACCCAGACXX: AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGG 2880 

TCCCCCCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940 

TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT G6AGACCCT6 3000 

AGAGACCCTG AGACCTGGGG CACCATGGCT GGCX!AGGTCA GAAGCATCCT GACTGCAGAO 3060 

6TCXX3TGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGOGGCTCAT CGGAGGCOCC 3120 

TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTGAAAT06 CTCAGAGATG AGATCCTTTA ATTGAAAACG AA6TGTAA0G 3240 

GAATCTAGTG TCTTTCTAAT GXGGTAAAAT TCTCXIATCAA CATCAC3U3TC AGCTGGCAGC 3300 

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGCT ACACG5ATGG GTCACACTGG 3360 

GTCTGGGGGC TCCCtbGAGC TCCTOCTGCG TGTGGTCTGG TTAGGA6TTG AGTTGTTTGC 3420 

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACAC3GAAT ACCTGCCTTC TCTGGCTTTC 3480 

CT6CTATACA CATATTCACA T6GGGCTCAA GAA6TTAGGC TCATGGCAAC GTGTGTCTTT 3540 

CTCIGGACAA CTGGCCCAGT TTACAGTGAA ATG6AGAATT TCAGGTCTCC ACGTCCGCX:C 3600 

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACQACCAAT CCCGATCGGC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACC3S. CCAATCCOGA 3720 

T06GCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780 

OCOGATOGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC T6TGACATCC TCCAGG6CCA 3840 

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAA6TG TCGAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACOCTC TGTCCCAGGG AATCTAGGAO 3960 

AGATGAGGCC OGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAA6 GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA G6AAGTGAGC 4080 

CCAGA6CATG GCACATQAGC ATCACCCGCT GATGGTGGCC TGCTGT6CCT GGTGCCAACA 4140 

GGGGCATCCC GOCCCGTACC CCTCCAGACA GQAAGCATOS 6TTTGCCCAC AQACCTGTCG 4200 

GGTGCTCCTQ TGAGTGGCCT CCAGATGTCT TTGTGCATAG OCACAAQTGG GCCAGGGCTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320 

GGTATTCCTG GCAGTA6CCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AOGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC A6GAGCACCC TAGGT6AGGG GTGA66GCCC 4440 

CCTTATGTGR ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTG6AT GGAGCCATTG 4500 

GOCTCCTTTT CTTCA6CGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAG6AGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620

GG6CTGGAAT GAGCCGGCTG QTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680

CCTCTCT6TT TACAGCTCCT TGACA6TCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740
GTGTT6GAGA AOAM^CAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA OTGCTTACTG 4800
TGCACAGATA CTCTTCAAGC ACTGGAOGTG 6ATTCTCTCT CTA6CCCTCA GCACCCCTGC 4860
GGTA6GAGTG CCGCCTCTAC CCACTTGTGA TGGG6TACA0 AGGCACTTGC TCTTCTQCAT 4920
GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTTGTACAAO AGCTCATGQC 4980
TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCXXRR QTT6CTCTCC 5040
TTAGTCTTGG TCATCAGAAC CTCACXTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100
GGAAAAAATA AACTCTTCCA TCCCTTAAAO AATAGAATAG TTTOTCCCTC TCATGG6AAT 5160
TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220
AACTTTTCAT GGACACAATT TCCAC3ACCT TTCAGATGCT GAT6TAGAGC TATTGGGAAA 5280
GAACTTCCAA ACTCAGGAAG TTTGCAOAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340
AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAQCC ATAAACCACT CAAAGATTCA 5400
GCCCCCAGAT CGCACAGTCA GAACTGAATC TGOGTTGTTG GGAAGCCAGC AGTGGCCTTG 5460
GQAAGGAAGC CATG6CT6TG GTTCAGAGAG 6GT6GGCTGG CAAGCCACTT CCGGGGAAAA 5520
CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCOGCTGC 5580
CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGCAAATCA 5640
GCCCCAGT6C TTGGCXSATGC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTO 5700
AGC0CT6GTG GGGAGGGTTG GGGGGTCT6T CTTCTGCTGG ATQCTGCTTG TAATCCATTT 5760
GGItSTACAGA ATCAACAATA AATAATATAC ATOTAT

Seq ID NO: 286 Protein sequence:
Protein Accession #: NP_570843.1

1 11 21 31 41 51

| | | | | |

MPLXHYLLLL VGCQAWGAGL AYHGCPSECT CSRA8QVSCT GARZVAVPTP IiPWHAMSLQZ 60
UTTHITELNE SPPLHISALI ALRZEKNELS RZTP6AFRNL GSLRYLSLAN NXLQVLPIGL 120
PQGLDSIjESL LLSSNQLLQI QPAHF5QCSN IiKEIiQLHGNB LEYIPDGAFD HLVGLTKLKL 180
QKNSLTHISP RVPQBLGHLQ VLRLYQIRLT DIPMGTFD6L VHLQELALQQ NQIGLLSPGL 240
FBNNHKLQRL YliSZlNHISQL PPSZFMQLPQ UmLTLFGHS LKBLSIiGIFG PMFMLRSLWL 300
YDNHISSLPD NVFSNLRQLQ VLILSRNQIS FZSPGAFNGL TEIjRBZiSIjHT NALQDLDGNV 360
FBMLANLQNI SLQNHRLRQL PGETIFANVNG LMAIQIiQNNQ LBHLFZjaiFD HLQKLCBLRIt 420
YDNPHRCjDSD IIiPLRNWLIiL NQPRLGTDTV PVCFSPANVR GQSLIZZNVN VAVPSVHVPE 480
VPSypETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVIDDRSVW GNTQAQSGLA 540
lAAIVIGIVA LACSLAACVG CCCCKKRSQA VLHQNKAFME C

Seq ID NO: 287 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 1..954

1 11 21 31 41 51

| | | | | |

ATGTCTTCTG AGCAGAAGA6 TCAGCACTGC AAGCCTGAGG AAGGOGTTGA GGCCCAAGAA 60
GAGGCCCTGG GCCTGGTGGG TGCACAGGCT CCTACTACTG AGGAGCAGGA GGCTGCTGTC 120
TCCTCCTCCT CTOCTCTGGT CCCTGGCACC CTGGAGGAAG TGCCTGCTGC TGAGTCAGCA 180
OGTCCTCCGC AGAGTCCTCA GOQAOCCTCT QCCTTACCCA CTACCATCAG CTTCACTTGC 240
TGGAGGCAAC CCAATGAGGG TTCCAGCAOC CAAGAAQAGO AGGGGCCAAG CACCT06CCT 300
GA06CAGAGT CCTTGTTCCG AGAAGCACTC AGTAACAAGG TGGATGAGTT GGCTCATTTT 360
CTGCTCOGCA AGTATOGAGC CAAGGAOCTO GTCACAAAGG CAGAAATGCT GGAGAGAGTC 420
ATCAAAAATT ACAAG06CTG CTTTCCTGTG ATCTTGG6CA AAGCCTCCGA GTCCCTGAAG 480
ATOATCTTTO GCATTGACGT GAAGGAAGTG 6AGCCCX3CCA GCAAGACCTA CACCCTTGTC 540
ACCXGCCTGG GGCTTTCCTA TGATGGCCTG CTGGGTAATA ATCAGATCT7 TCCCAAGACA 600
GGCCTTCTGA TAATCGTCCT GGGCACAATT GCAATGGAGG GC30ACAGCGC CTCTGAGGAG 660
GAAATCTGGG AGGAGCTGGG TGTGATGGGG GTGTATGATO OGAGGGAGCA CACTGTCTAT 720
GGGGAGCCCA GGAAACTGCT CACCCAAGAT TGGGTGCAGG AAAACTACCT GGAGTACCGG 780
CAGGTACCOS GCAGTAATCC TGGGCGCTAT GAGTTCCTGT GGGQTCCAAG G6CTCT66CT 840
QAAACCAGCT ATGTOAAAGT CCTG6AGCAT GTGQTCAQGO TCAATGCAAG AGTTGGCATT 900
6CCTACCCAT CCCTGGGTQA AGCAGCTTTG TTAOAGGAQG AAGAGGGAGT CTGA

Seq ID NO: 288 Protein sequence:
Protein Accession #: NP_002353.1

1 11 21 31 41 51

| | | | | |

MSSEQKSQHC KPEEGVEAQE EALGLVQAQA PTTEEQEAAV SSSSPLVPGT LEEVPAAESA 60
GPPQSPQGA6 ALPTTISFTC HRQPNB6SSS QEEEGPSTSP DABSLFREAL SNKVDELAHF 120
LIiRKYRAKBIj VTKAEMIiERV IKNYKRCFFV IFGKASESItK MZFGIDVXEV DPAS^^TYTLV 180
TCLGLSYDGL LGNNQIFPKT OLLIIVLGTI AMEGDSASEB EZWBELGVM6 VYXXSREBTVY 240
GEPRKLLTQD WVQSNYLBYR QVPGSNPARY EFLHGPRALA ETSYVKVLBH WRVNARVRZ 300
AYPSLREAAIj IiEEBEGV



Seq ID NO: 289 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 46-1344



1 
I 

OGGOGGCCGC 
GGCGACCTGA 
CATCAGCGCG 
CTCAACAGAC 
TTGACCAGAA 
CAGCCCaTCG 
GQCCCCAGCA 
GTTCTACCTG 
AAATCCXrATC 



11 

I 

GCCCTGGTTG 
AGCAGGCGCT 
GCAGCAGCAC 
ATAATATTQT 
ATGTGCAGTC 
ATTTGASTGC 
GTGAAAATCT 
CAGCTGAATT 
TCCTCGATTA 




TGCAAAGAAA 
GTTTGGTGAT 
TGTGTCTATT 
AT6CACTGTT 
GGAGGAAGAG 
CCATGGGCTT 
TGTGATGACA 



31 
I 

GCTCTCG6GG 
GCCGAGTCGC 
GAAGACATAA 
TACACATGGA 
ATTGACACA6 
6CACTTCACA 
ACAGAAAACA 
TGGGACAGCT 
ACTTTACTGT 



41 

I 

G06CCATGGA 
CAACGGTCCA 
ACCTGAGTGT 
CT6AGTTT6A 
AATTAAAG6T 
TTTTOCAGCT 
TAATTGCAGC 
TGGTATACGA 
TTTCAGACAA 



51 
I 

CQAGGCCGTG 
CGTGGAGGTG 
TAGAAAGCTA 
TGAACCTTTT 
TAAAGACTCA 
GAAT6AAGAT 
AAATCACTG6 
TGTGGAAGTC 
QAAG6TCAAC 



A6CAACCTCA TCACCTGGAA CCGGGT66TG CTGCTCCA0C3 GTCCTCCTGG CACT66AAAA 600

ACATCCCTOT GTAAAGCOTT AOCCCAGAAA TTGACAATTIV 6ACTTTCAAG GAGGTACOGA 660 

TATGGCC3UIT TAATTGAAAT AAACA6CCAC AGCCTCTTTT CTAAGTGGTT TTCGGAAAGT 720 

GQCAAGCTGO TAACCAAGAT GTTTCAGAAG ATTCAGGATT TGATTGAT6A TAAAGACGCC 780 

CTGGTGTTCG TGCTQATTGA TGAGGTGGAG AGTCTCACAG CCGCCCGAAA TGCCTGCAGG 840 

GCGGGCACCG A6CCATCAGA TGCCATCCGC 6TGGTCAATG CTGTCTTGAC CCAAATTGAT 900 

CAGATTAAAA G6CATTCCAA TGTTGTGATT CTGACCACTT CXAACATCAC OQAGAAOATC 960 

OACSTGGCCT TCGTGGACAG GGCTGACATC AA6CAGTACA TTG6GCCACC CTCTGCAGCA 1020 

GCCATCTTCA AAATCTACCT CTCTTGTTTG GAAGAACTGA TGAAGTGTCA GATCATATAC 1080 

CCTCGCCAGC AGCTGCTGAC CCTCC3GAQAG CTAGAGATGA TTGGCTTCAT TGAAAACAAC 1140 

GTGTCAAAAT TGAGCCTTCT TTTGAATGAC ATTTCAAG6A A6AGCGAGGG CCTCAGCG6C 1200 

CX366TCCT6A GAAAACTCCC CTTTCTGGCT CATGCQCIGT AIQTCCA6GC CCOCACOGTC 1260 

ACCATAGA6G GGTTCCTCCA GGCXXITOTCT CTQGCAGOXSG ACAAGCAGTT TQAAGAGAQA 1320 

AAGAA6CTTG CAGCTTACAT CTGATC3CTGG GCTTCCCCAT CTGGTGCTTT TCCCATGGAG 1380 

AACACACAAC CAGTAAGTGA GGTTGCCCCA CACAGCQGTC TCCCAGGGAA TCCCTTCTGC 1440 

AAACCAAACG TTACTTAGAC TGCAAGCTAG AAAGCCACCA AGGCCAGGCT TTGTTAAAAG 1500 

AAGTGTATTC TATTTATGTT GTTTTAAAAT GCATACTGAG AGACAAACAT CTTGTCATTT 1560

TCACTOTTTQ TAAAA6ATAA TTCAGATTGT TTGTCTCCTT OTGAAGAACC ATC6AAACCT 1620 

GTTTGTTCCC AGCCCACCCC C3WSTGGATGG GATGCATAAT GCCAGCAAGT TTTGTTTAAC 1680

AGCAAAAAAG GAAGATTAAT GCAGGTGTTA TAGAAGCCAG AAGAGAAACT GTGTCACCCT 1740 

AAAGAAGCAT ATAATCATAG CATTAAAAAT GCACACATTA CTCCA6GTGG AAGGT6GCAA 1800 

TTGCTTTCTG ATATCAGCTC GTTTGATTTA GTGCAAAAAT GTTTTCAAGA CTATTTAATG 1860 

GATGTAAAAA AGCCTATTTC TACATTATAC CAACTQAGAA AAAAATGGTC GGTAAAGTGT 1920 

TCTTTCATAA TAAATAATCA AGACATGGTC CCATTTGCAG GAAAA6TGCA GACTCTGAGT 1980 

GTTCCAGG(A AACACATGCT GGAGATCCCT TGTAACCCGG TKEGGGOQCC CCTGCATTGC 2040 

TGGGATGTTT CTGCCCAOGG TTTTGTTTGT 6CAATAACGT TATCACATTT CTAATGAGGA 2100 

TTCACATTAA TATAATATAA AATAAATAfSG TCAGTTACTG GTCTCTTTCT GCCGAATGTT 2160 
ATGTTTTGCT TTTATCTCAC AGTAAAATAA ATATAATTAA AAA 

Seq ID NO: 290 Protein sequence:
Protein Accession #: NP_004228

1 11 21 31 41 51 

I I I I I I 

MDBAVGDLKQ AUPCVAESPT VHVBVBQRG5 STAKKBDZNI* SVSKLUIRHN IVFGDyTWTB 60 

POEPFLTRNV QSVSIIDTEE* K7KDSQPIDL SACTVALHZF QIHSDGPSSE NLEEBTENII 120 

AAMSNVLPAA EFHGLWDSLV YDVEVKSHLL DyVMTTIiLFS DKNVNSNLIT WNKWLLHGP 180 

PGTGXTSLCK ALAQKLTIRL SSRYRYGQLI EINSHSLFSK WFSBSGKLVT KMFQKIQDLI 240 

DDKDALVFVL IDEVESLTAA RKACRAGTEP 8DAIRVVNAV LTQIDQIKRH SNWZLTT8N 300 

ITEKIDVAFV DRADZRQYIG PPSAAAIFKI YI.SCLEEU4K OQZZYPRQQL LTLRELEMIG 360 

FZENNVSKLS LLLMDZSRKS E6L8GRVLRK LPPLASALYV OAFTVTIEGP LQALSLAVDK 420 
QFEERKKLAA YZ 

Seq ID NO: 291 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77-1372

1 11 21 31 41 51

| | | | | |

GTCCCCGCAG CGCCGTCGCG CCCTCCT6CC GCAGGCCACC QAGGCC3GC0G COGTCTAGCG 60 

CXX:C3GACCTC GCCACCATGA GAGCCCTGCT GQCGCGCCTG CTTCTCTGCG TCCTGGTCGT 120 

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180 

TGQAGGAACA TGTGTGTCCA ACAAOTACTT CTCCAACATT CACTGGTOCA ACT6CCX»AA 240 

GAAATT0G6A GGGCAGCACT 6TGAAATAGA TAAGTCAAAA ACCTGCTATG AGGGGAAT6Q 300 

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGOC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAACGTACCA TGCCCACAGA TCT6ATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT 6CAGGAA0CC AGACAACCGG AGGGGACCCT GGTGCrATGT 480 

GCAGGrGGGC CTAAAGCOSC TTGTCCAAGA 6TGCATGGTQ CATSACTQCaG CAGATGGAAA 540 

AAAGCCCTCC TCTCCTCCAG AA6AATTAAA ATTTCA6T6T G6CCAAAAGA CTCTGAGGCC 600 

OCGCTTTAAQ ATTATTGGGG GAGAATTCAC CACCATCGAG AACCAGCCCT GGTTTGCGGC 660 

CATCTACAQG AGGCACOGGG GGGGCTCTGT CACCTACGT6 TGTGGAQGCA GCCTCATCAG 720 

CCCTTGCTGG GTGATCAGC6 CCACACACTG CTTCATTGAT TACX:CAAAGA AGGAG6ACTA 780 

CATCGTCTAC CTG6GTCGCT CAAOGCTTAA CTCXAACACS CAAGGGGAGA TGAAGTTTQA 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAOCQCTQAC AC6CTT6CTC ACCACAAGQA 900 

CATTGCCTTG CTGAAOATCC GTTCCiAAGGA GGGCAGGTGT GCGCAGCCAT CCCGGACTAT 960 

ACAGACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGGCACAA GCTQTQAGAT 1020 

CACTGGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT OCGGAGCAGC TGAAAATGAC 1080 

TGTTGTGAAQ CTGATTTCCC ACCGGGAGTG TCAGCAGCCC CACTACTA06 GCTCTGAAGT 1140 

CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATGGAAA ACA6ATTCCT GCCAGG6AGA 1200 

CTCAGGGGSA CCCCTCGTCT GTTCCCTCCA AGGCOGCATG ACTTTGACT6 GAATTGTGAG 1260 

CTGGGGCGGT OGATGTGCCX: T6AAGGACAA GCCAGGCGTC TACACGAGAG TCTCACACTT 1320 

CTTACCCTGG ATCGGCAGTC ACACCAAGGA AQAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA CGGGCACCAC CCGCTTTCTT 6CTGGTTGTC ATTTTTGCAG TAGA6TCATC 1440 

TCCATCAGCT GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGGAT TTGCCTGTGG 1500 

CACCACCAGG GTGAACGACA AIAGCTTTAC CCTCACGGAT AGGCCTGG6T GCTG6CTGCC 1560 

CAGACCCTCT GOCCAGGATO OAGGGGIGGT CCT6ACTCAA CATGTTACTa AGCAGCAACT 1620 

T G T Cn T n ' C TCGACT6AAG CCTGCAG6AG TTAAAAAGGG CAGGGCATCT CCTGTGCATG 1680 

GGCTCQAAGG GAGAGCX3M3C TCCCCCGACC GGTGGGCATT TGTGAGGCXX: ATGGTTGAGA 1740 

AATGAATAAT TTCCCAATTA GGAA6TGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800 

AATGTGGGAG CAGCGGTTTG GGGAGCAGAG ACACTAACQA CTTCAGGGCA GGGCTCT6AT 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTG6 1920 

GCTGTGAGTG TAAGTGTGAG 7AAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAGAG TGOTCTTTCT GGA6AGGTTA TAGGTCACTC 2040 

CTGGGGCCTC TTGGGTCCCC CACGTGACAG TGCCTGGGAA TGTACTTATT CTGCAGCATO 2100 

ACCTGTGACC AGCACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACTGGG TGGGGTGAGO ACCACTCCTT 2220 

ACACTGAAtA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAQTG 2280 



ATCAATAAAA TGTGATTTTT CTGA 

Seq ID NO: 292 Protein sequence:
Protein Accession #: NP_002649.1

1 11 21 31 41 51

| | | | | |

mAULMCUSJi CVLWSDSKG SNBLHOfVPSN CDCUJGGTCV SNKYFSNIHW CNCPKKFGQQ 60 
HCEIDKSKTC YEGNQHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQliGI/SKHN 120 
YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSSP PSELKFQGGQ XTLRPRFKZX 180 
GGBFTTIQiQ PWFAAIYRRH RGGSVTYVCQ QSLISPCWVI SATHCFIDYP KKEDYIWWS 240 
RSRLNSNTQG EMKPBVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 300 
PSMWTOPQPG TSCBITGFGK BNSTDYLYPB QLKMTWKIjI SHRBCQQPHY YGSEVTTKML 360 
CAADPQWKTD SCX3GDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVYT RVSHFLPWIR 420 
SHTKBEraGLA L 

Seq ID NO: 293 DNA sequence
Nucleic Acid Accession #: NM_001498
Coding sequence: 93..2006

1 11 21 31 41 51

| | | | | |

GGCACGAGQC TGAGTGTCC30 TCTOGCGCXrC GOAAQCGGGC GACOGCOGTC AGCCmSAGG 60 

AGGAGGAGGA GQAOSAOCnO OASGGGQCGO CGATGOOGCT GCTGTCOCAO QGCTOGCOGC 120 

TGAGCTG6GA G6AAACCAAG OQCCATGCOG ACCA06T60G G066CA06GG ATCCTGCAOT 180 

TCCTCCACAT CTACCACX5CC GTCAAGGACC GGCACAAGGA CGTTCTCAAG TOGGOCGATG 240 

AGGTOGAATA CATCTTGGTA TCTTTTGATC ATGAAAATAA AAAAGTCOGG TTGGTCCTQT 300 

CTGGGGA6AA AGTTCTTGAA ACTCTGCAAG AGAAGGGGOA AAGGACAAAC CCAAACCATC 360 

CTACGCTTTG QAGACCAGAG TATGGGAOTT ACATGATTGA AGGOACACCA GGACAGCCCT 420 

ACSGOAGQAAC AATGTOOGAG TTCAATACAG TTGAGGCCAA CATGCGAAAA OGCCGGAAGG 480 

AGGCTACTTC TATATTAGAA GAAAATCAGG CTCTTTGCAC AATAACTTCA TTTCOCAGAT 540 

TAGGCTGTCC TGGGTTCACA CTGCCCGAGG TCAAACCCAA CCCAGTGGAA GGAGGAGCTT 600 

CCAA6T0CCT CTTCTTTCCA GATGAAGCAA TAAACAAGCA CCCTOGCTTC AGTACCTTAA 660 

CAAGAAATAT CXSACATAOG AOAGGAGAAA AGGTTGTCAT CAATGTACCA ATATTTAA6G 720 

ACAAGAATAC ACCATCTCCA TTTATA6AAA CATTTACTGA GGATGATGAA GCTTCAAG6G 780 

CTTCTAAGCC GGATCATATT TACATGQATG CCATQGGATT TGGAATGGGC AATTGCTGTC 840 

TCX31G6TGAC ATTCCAAGCC TGCAGTATAT CTGAC3GCCAG ATACCTTTAT OATCAOTTGO 900 

CTACTATCTG TCCAATTGTT ATGGCTTTGA GTGCTGCATC TCCCTTTTAC CQAGGCTATQ 960 

TQTCAQACAT TGATTQTCGC TOGGGAGTQA TTTCTGCATC TOTAOATaAT AQAACTCGGO 1020 

AGGAGOSAGG ACTGGAGCCA TTGAA6AACA ATAACTATAO 6ATCAGTAAA TCCCGATAT6 1080 

ACTCAATAQA CAGCTATTTA TCTAAGTGTG GTGAGAAATA TAATGACATC GACTTGAOGA 1140 

TAQATAAA6A GATCTACGAA CAGCTGTTGC AOQAAGGCAT TGATCATCTC CTGGCCCAGC 1200 

ATGTTGCTCA TCTCTTTATT AGA6ACCCAC TQACACTGTT TGAAGABAAA ATACACCTGG 1260 

ATGATGCTAA T6AGTCTGAC CATTTTGASA ATATTCAOTC CACAAATTGG CAGACAATGA 1320 

GATTTAAGCC CCCTCCTrCCA AACTCaGACA TTGQATGGAG AGTAGAATTT CQACCCATQG 1380 

AGGTGCAATT AACAGACTTT GAGAACTCTG CCTATGTGGT GTTTGTGGTA CTGCTCACCA 1440 

GAOTOATCCr TTCCTACAAA TTGGATTTTC TCATTCCACT GTCAAAGGTT QATGAGAACA 1500 

TGAAG6TAGC ACAGAAAAGA GATGCTGTCT TGCAGGGAAT GTTTTATTTC AGGAAAGATA 1560 

TTT6CAAAG0 TGGCAATGCA GTGGTGGATO 6TTGTGGCAA GGCCCAGAAC AGCACGGAGC 1620 

TOGCTGCAGA QGAGTACACC CTCATGAGCA TAGACACCAT CATCAATGGG AAGGAAGGTG 1680 

TGTTTCCTGG ACTGATCCCA ATTCTGAACT CTTACCTTGA AAACATGGAA GTGGATGTGG 1740 

ACACCAGATG TAGTATTCTG AACTACCTAA AGCTAATTAA GAAGAGAGCA TCTGGAGAAC 1800 

TAATGACAGT TGCCAGATGG ATGAGGGAGT TTATCGCAAA CCATCCTGAC TACAAGCAAG 1860 

ACAGTGTCAT AACTQATGAA ATOAATTATA OCCTTATTTT GAAGTGTAAC CAAATTGCAA 1920 

ATGAATTATG TGAATGCCXa GAGTTACTTG GATCAGCATT TAGGAAAGTA AAATATAGTG 1980 

GAAGTAAAAC TGACTCATCC AACTAGACAT TCTACAGAAA GAAAAATGCA TTATTOACGA 2040 

ACTGGCTACA GTACXATGCC TCTCAGCCCG T6TGTATAAT ATGAAGACCA AATGATAOAA 2100 

CTOTACTOTT TTCTGGGCCA GTGAGCCAGA AATTGATTAA GGCTTTCTTT GGTAG GTAAA 2160 

TCTAGAGTTT ATACAGTGTA CATGTACATA GTAAAGTATT TTTQATTAAC AATGTATTTT 2220 

AATAACATAT CTAAAGTCAT CATGAACTGG CTTGTACATT TTTAAATTCT TACTCTGGAG 2280 

CAACCTACTG TCTAAGCAGT TTTGTAAATG TACTQGTAAT TGTACAATAC TTGCATTCCA 2340 

GAGTTAAAAT GTTTACTGTA AATTTTTGTT CTTTTAAAGA CTAOCTQGGA OCTG ATTTAT 2400 

TGAAATTTTT CTCTTTAAAA ACATTTTCTC TCGTTAATTT TCCTTTGTCA TTTCCTTTGT 2460 

TGTCTACATT AAATCACTTG AATCCATTGA AAGTGCTTCA AGGGTAATCT TGGGTTTCTA 2520 

GCACCTTATC TATGATGTTT CTTTTQCAAT TGGAATAATC ACTTGGTCAC CTTGCCOCAA 2580 

GCTTTCCCCT CTGAATAAAT ACCCATT6AA CTCTGAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 294 Protein sequence:
Protein Accession #: NP_001489 



1 11 21 31 41 51 

K6LLSQG6PL SWEETKRHAD HVRRHGILQP LHIYHAVKDR HKDVLKWGDE VEYMLVSPDH 60 

SKMCVRLVIiS GEKVl*ETIiQE KGBRTNPNKP TLWRPBYGSY MIBGTPGQPY GGTMSEPNTV 120 

BANNRKRRKB ATSILEENQA LCTITSFPRL GCPGFTLPEV KPNPVEGGAS KSLFFPDBAI 180 

NKRPRFSTIiT RNIRHRRGEK WINVPIPKD KNTPSPPIET PTEDDEASRA SKPDHZYMDA 240 

MGFO^GNCCL QVTFQACSIS EARYLYDQLA TICPIVMALS AASPPYRGYV SDIDCRWGVI 300 

SASVDDRTRE ERGLEPLKNN NYRISKSRYD SIDSYLSKCG EKVNDIDLTI DKEIYEQLLQ 360 

BGIDHLIiAQH VAHLFIRDPL TLPEEKIHLD DANESDHFEN IQSTNNQTMR PKPPPPNSDI 420 

QMRVEFRPHE VQLTDFENSA YWFWLliTR VILSYKLDFL IPLSKVDENM KVAQKRDAVIi 480 

QGHFYFRKDI CKGGNAVVDG 06KAQNSTEL AAEEYTUtSI DTIINGKB6V PPGLIPILNS 540 

YLENHEVDVD TRCSIUIVLK LIKKRASGEL MTVARimREF lANHPDYKQD SVITDEMNYS 600 
LILKOIQIAN ELCECPELIjG SAFRKVIOrSG SXTD88H 




Seq ID NO: 295 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-816

1 11 21 31 41 51

| | | | | |

AGT6TTCGGC TGGGGCAGGC A OSCTG IGGC TGGCTACTTC CCTTCCTCOC ATCCCCCTTO 60 

GQCCAAACGG GATCGGTGCT TCTGQTGAGA OSOCTCCCCA TGCACATCAC TCXX3«3GTGC 120 

CCTAGGGGGC ACATTTOCCA CAACTCCCAG AG6GCAGGTT TCTAGAAAGT GCCACCAGIG 180 

GGGAG6CGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 

CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAAGGCAGA GQATGGCCCT GTTGGCAAG6 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCXA TQTCCAAAOA AAAGAAOCTT 420 

ATGACaVGGAC ATGCTATTCC ACCCAGOCAA TTGOATTCTC AGATTGATGA CTTCACTGGT 480 

TTCAGCAAAO ATAOGATGAT QCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540 

ACCAGCAGTT TCTCTQQAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 

C3VACX5AQAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCOS ATGCQTTQQA 660 

CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 

AAGCOATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGAOG AGACTTTGTT 780 

AAGC»CCTTA AGAAGAAACT GAAACQTATG ATTTGAGAAT ACTTGTCCCT GQAGQATTAr 840 

CACaCCCCAA ATGCATAATC TCGTTAATGA TTOAGGAGAG AAAAGGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAAATAGC 960 

CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCATTTTAT 1020 

TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGQ TAGATATTAT 1080 
TAACCCATTA GQTAAATACT ATTACMTCG TGQTTTCTOC A 

Seq ID NO: 296 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MTDKTEKVAV OPETVPKRPR ECDSPSYQKR QRMALLARKQ 6AGDSLIAGS AMSKEKKLMT 60 
CSiAIPPSQLD SQIDDPTGFS KDRMMQKPGS NAPVGGNVTS SPSGDDLECR ETTASSPKSQR 120 
EINADIKRKL VKELRCV6QK YEKIPEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 180 
LKKKLKRMI 

Seq ID NO: 297 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815

1 11 21 31 41 51

| | | | | |

A£STGTTOGGC TGGGGCftGGC AOGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 

GGCCAAACGG GAT06GTGCT TCTGGTQAGA CGCCTCCCCA TGCACATCAC TCCCA6GTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 

GAAACAATGA CCGATAAAAC AGAOAAOGTO GCTGTAGATC CTOAAACTOT GTTTAAACGT 300 

CCCAGGGAAT GTGACAOTCC TTOQTATCAG AAAAGGCAGA GGATCGCCCT GTTGGCAAGG 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCftAAGA AAAGAAGCTT 420 

ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480 

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540 

ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 

CAACAAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCC6 ATOOGTTGGA 660 

CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 

AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGAC6 AGACTTTGTT 780 

AAQCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAQGATTAT 840 

CACACCCCAA ATGCATAATC TCATTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAACTGGC 960 

CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCGTTTTAT 1020 

TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCKSTGCTGG TAGATATTAT 1080 
TAACCCATTA GQTAAATACT ATTACAGTCQ TGGTTTCTGC A 

Seq ID NO: 298 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51

| | | | | |

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 60 

GHAIPPSQLD SQIDDPTGFS KDRMMQKPGS NAPVGQIVTS SPSGDDLECR ETASSPKSQQ 120 

EINADIKRKL VKBLRCVGQK YEKIEBMLBO VQGPTAVRKR PFESHKEAA RCMRRDFVXH 180 
LKKKLKRNI 

Seq ID NO: 299 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815

1 11 21 31 41 51

| | | | | |

AOTGTTCGGC TGGGGCAGGC ACGCTGT6GC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 

GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 

QRAACAATQA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 

CCCAGGGAAT GTGACAOTCC TTCGTATCAO AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420 

TGACAGGACA TGCTATTCCA OOCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTCGTT 480 
TOVGCAAAGA TAOGATGATO CA6AAACCTG GTA6CAATGC ACCTGTGGGA GGAAACGTTA 540 
OCAGCAGTTT CTCTGGAGAT OACCTAGAAT GCAiGAGAAAC AGCCTCCTCT CCCAAAAGCC 600 
AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTT6GAC 660 
AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAG6A 720 
AAOaATTTTT TQAATCCATC ATCAAG6AA6 CAGCAAGATQ TATQAGAOSA GACTTTGTTA 780 
AGCACCTTAA GAAGAAACTG AAACX3TATGA TTTGAGAATA CTTGTCCCTQ GAGGATTATC 840 
ACACCCCAAA TGCATAATCT CATTAATGAT TGA66AGAGA AAAGGATCAG ATTGCTQTTT 900 
TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACOQAA TCAACTGQCC 960 
TTCCAGA6GC TAAGAAATTT CTGTTAGTAA AAQATGTTCT TTTTCCCAftA GCGTTTTATT 1020 
TGAAAG6ATA ACTTGTGTTT TGCTTATTTT QTATTCOCAC CTGTGCTGGT AGATATTATT 1080 
AACCCATTAG GTAAATACTA TTACA0TGST GGTTTCTGCA 

Seq ID NO: 300 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MTDKTBKVAV DPETVFKRPR ECDSPSYQKR QRHALLARKQ GAGDSLIAGS AMSKAKKLMT 60 
GHAIPPSQLD SQIDDFTGFS KDBMMQKFGS NAPVGGNVT8 SF8GDDLECR ETASSPK8QQ 120 
EIMADIXRKL VKELRCVGQK YEKIFBHIiBG VQ6PTAVRKR FFBSIIKBAA RCNSRDFVKH 180 
LKKKLKRNI 

Seq ID NO: 301 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-812

1 11 21 31 41 51

| | | | | |

AGTGTT0G6C TGGGGCAGGC AOGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 
GOCCAAAOQO QATCQGTGCT TCTGGTGAGA OGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 
CCTAGGGG6C ACATTTCXX3V CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 
[illegible] ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCM6G 240 
GAAACAATGA CGGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT OTTTAAA06T 300 
CCCAGGGAAT GTGACA6TCC TT06TATCAG AAAA6GCAGA GGATGGCCCT GTTGGCAAGG 360 
AAACAAG6AG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAGCTTA 420 
TGACAGGACA TGCTATTCCA CCCAGCCAAT TG6ATTCTCA QATT6ATGAC TTCACTGGTT 480 
TCAGCAAAGA TGGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAATGTTA 540 
CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAGQAAT AOCCTCCTCT CCCftAAAGCC 600 
AACAAGAAAT TAATGCTGAT ATAAAATGTC AABTAGTGAA GQAAATCCGA TQCCTTGGAC 660 
AATATGAAAA AATCTT06AA ATGCTTGAAG GAGTGCAAG6 ACCTACTGCA GTCAGGAAAC 720 
QATTTTTTGA ATCCATCATC AAGGAAGCAG CAAGATGTAT GAGACGAGAC TTTGTTAAGC 780 
ACCTTAAGAA GAAACTGAAA CGTATGATTT GAGAATACrr GTCCCTGGAQ GATTATCACA 840 
CCCCAAAT6C ATAATCTCAT TAATGATTGA 06AGAGAAAA G6ATCAGATT GCTGTTTTCT 900 
ACAATGGAGC AGGATATTGC TGAAGTCTCC TGGCATATGT TACCXaAATCA ACTG6GCTTC 960 
CAGAGGCTAA GAAATTTCTG TTAGTAAAAG ATGTTCTTTT TCCCAAAGGS TTTTATTTQA 1020 
AAGGATAACT TGTGTTTTGG TTATTTTGTA TTCCCACCIQ TQCTGOTAGA TATTATTAAC 1080 
CCATTAGGTA AATACTATTA CAGTCGTGGT TTCTGCA 

Seq ID NO: 302 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MTDKTEECVAV DPETVFKRPR BCDSFSVOKR QRMALLARKQ GAGDSIiIAGS AMSKEKXZMT 60 
QIAIPPSQU3 SQIDDFTGFS KDGMKQKFG8 MAFVGGNVT8 HP8GDDI«&CR GZASSPKSQQ 120 
EINADIKOQV VKEIRCLaQY EKIFSCiEGV Q6PTAVRKRF FBSIIKEAAS OIRRDFVKRL 180 
KKKLKRMZ 

Seq ID NO: 303 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815

1 11 21 31 41 51

| | | | | |

AGTGTTGGGC TGGGACAGGC AGGCTGTGGC TGGCTACTTC CCTTCCTTCC ATCCCCCTTG 60 
GGCCAAACAG GATCGGTGCT TCTGGTGAGA OSTCTCCCCA TGCACATCAC TCCCAGAT6C 120 
CCTA6GGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 
GGOAGGOGCC ACAACTTCAC TGCCATTTTG T8AG6TGC0Q CCGTCTCTCC TCCAGCAAGG 240 
GAAACAATGA COQATAAAAC AGAGAAGGTG GCVQTAOATC CTGAAACTGT GTTTAAAOGT 300 
CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAA6GCAGA GGATGGCCCT GTTGGCAAGG 360 
AAACAAGGA6 CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420 
TGACAGGACA TGCTATTCCA CCCAGCCAAT T6GATTCTCA GATTGATGAC TTCACTGGTT 480 
TCAGCAAAGA TAGGATGATG CAGAAACCTG QTAGCAAT6C ACCTGTGGGA GGAAACGTTA 540 
CCAGCA6TTT CTCTGQAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600 
AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGOGTTGGAC 660 
AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720 
AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGAOGA GACT^TTGTTA 780 
AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840 
ACACCCCAAA TGCATAATCT CGTTAATGAT TGAG6AGAGA AAAGGATCAG ATTGCTGTTT 900 
TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACGGAA TCAACTGGCC 960 
TTCCAGAGGC TAAGAAATTT CTGTTAGTAA AAGATGTTCT TTTTCCCAAA QQGTTTTATT 1020 
TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCAC CTQT6CTQ0T AGATATTATT 1080 
AACCCATTAG GTAAATACTA TTACAGTCGT GGTTTCTGCA 



Seq ID NO: 304 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MTDKTEKVAV DPETVFKRPR ECD8PSYQKR QRMAIiLARKQ GAQDSLIAGS AMSKAXKLHT 60 
6EAIPPSQLD SQIDDFTGFS KDRHHQKP6S NAPVGGNVTS SFSGDDLBCR BTASSFRSQQ 120 
EINADIKRKL VKELRCVGQK YEKIFEHLBQ VQQPTAVRKR FFBSIIKBAA RCHRRDFVXH 180 

Seq ID NO: 305 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 07-689



51 

1 

CCTGCGCTGC 
CCAAAGG6CA 
AAGAAAAACC 
TGGAAGAC6A 
GTGCX^CTATG 
GATCCTAATG 
CCCAAGATCy\ 
GA6ATGTGGA 
CTGAAGGAOA 
6CAAAGG6TC 
GAGGAAGAAG 
TTGTQAATAC 
TGTTGCCCTC 
GTGCTCTAGA 
AACTGTATCR 
TCTCCTCACT 
TTTTCTTTTT 
CAACTGTATA 
TCCTTGCTCG 
CGTGCACTGT 
TCTTGTTTCT 
GGTCA6CTGG 
TTAAftCAAAC 
GTTCAAQAAC 
TTAGAATGCT 
CTATTATAAC 
6GAGACTGGC 
AGGGCTGAGA 
TGGCATTCAG 
TCXaAGGTAT 
ACCTCCACCC 
GGTGTGTCAA 
GGCACATATC 
CCTGOATGCC 
TT6CTCCTAC 
CCTGA6CTAT 
AAOTTGAATG 
TGTTAAACTG 
TTTTATCCAG 
ATTTTGAAGG 

CTATGAAACA 
AAAATTCACT 
CCTGGATGAG 
TTOGCTCTCXI 
ACTAATTTAA 
TCCTGCTTCA 
TCGTGATGTG 
AACAAOCTGC 
ATOTAOOUIC 
CTAAGCAGAA 

ttaaag;^c 

TCAACGTTTT 
GAGATGATGT 
ACTGTCT6T6 
AGGATAGTAC 
6GAAAGAACA 
TGTCAOUUIT 






1 
I 

C6TGGAGGCA 
CCA6ACTAGC 
AGATGTCCGC 
CAGAGGTCCC 
T6TCCGGGAA 
ATCGGGAAAT 
CTCCCAAAAG 
AATCCACAAA 
ATAATTTAAA 
A6TATGAGAA 
CTGCTAAAGT 
AGGAGGAGGA 
TTAGA6TAGG 
ATTAGGTTTA 
AATTGTCAGT 
AAGTT6TACA 
CTGT6CACT7 
ATTTGTAAG6 
TATCTATAGT 
GCGTTGAGGC 
GAGGCTGGAC 
GTATATA6TG 
CATQA6AATA 
TATAGAACTC 
CTCCTGTACT 
GAAATGTTTT 
AGTCAATTTC 
CTCCCTATAA 
T6AAGGAGAG 
ATGAAGTCT6 
AGGAA6GTGG 
CCTATTTTGT 
AAATTAAGGC 
ACATTATTTX3 
AGTGAX3GGTA 
TTGGAAACAC 
AGGQCAOGCT 
GGCT6CTCAT 
GAAGGAGCTT 
GTTGGGGTGA 
CTGATGTGTA 
GTGAGTGTTG 
OGCAGGAGTG 
GTTGAGAAAC 
TAOGAGTTAT 
CATCAGAACT 
AAAACTCGGT 
GCCATGTOCT 
CTAS6CCAAQ 
GQTCAAAAGG 
CAGGGAASGG 
GTG6GGGAGC 
T6TCAAGTTG 
CACACTGTGG 
AAGTGGCATQ 
TTCTTAASAT 
CTACTCCCTC 
AGGCTGTGAT 
GTATTTGGG6 



11 

I 

GCTAGCGCGA 
GAACAATACA 
TTATGCCTTC 
TGTCAATTTT 
AGAGAAATCT 
GAAGGATTAT 
GCCACCGTCT 
CCCCGGCATC 
TGACAGT6AA 
GGATGTTGCT 
TGCCOGGAAA 
G6AGGAGGA6 
GGAGC6CCGT 
ATTACAAAAT 
GGTTTACATG 
TATTTCCAAA 
TGCTOTTGGT 
TGGTG6TAAC 
TTGTAAAAAG 
TGT60GGAAG 
CTGTTGACrC 
ACATAGCATT 
TTTTTTTTTT 
TTCATTarCA 
TAAACACGAT 
TGAAGTTAAA 
TGACTCACAG 
ATGTGGTAGC 
GGCTACTTGA 
GAGGAGTTAG 
GTGATTAG6A 
GGGGCCAAAT 
CTTATTGTTT 
TGGTGCCCAA 
TGT6GQATGG 
CAAACACCCC 
AATGGAATCA 
AAGTTTAGCT 
GGTTTGTGTG 
GGGGftGATGG 
TATACATCAT 
CTATTGCCCA 
TTTTTGTGCT 
TTGCATQTCT 
GGTCACGGTC 
GTTTGTCCTG 
TGTGAGQTTT 
TGTCACTTGG 
ATTCGGGA6C 
GAGTGATTTG 
GCAAGGATGG 
AGTTTA6CCA 
AGGCCACTTG 
CAAGATTGCT 
ATGTTACCTA 
GCCAACCTGT 
TAACCACCTC 
6TA6TGACTA 
ACGTTCGATG 




CATTCATTTT 



Seq ID NO: 306 Protein sequence:
Protein Accession #: NP_005333.1

1 11 21 31 41 51

| | | | | |

MAKGDPKKPK GKMSAyAPFV QTCREBHKKK NPBVPVNFAE FSKKCSBRWK IMSGKEKSKP 60 
D91AICADKVR YDREMKDYGP AKGGKKKXDP NAPKRPPSGF FLFCSEPRPK IKSTNPGISI 120 
GDVAKKLGEM WNNIJiDSEKQ PYITKAAKLK EKYEKDVADY KSKGKFDGAK GPAKWARKKV 180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 






Seq ID NO: 307 DNA sequence 
Nucleic Acid Accession #: NM_022342 
Coding sequence: 1..2178

1 11 21 31 41 51

| | | | | |

ATGGGTACTA GGAAAAAAGT TCATGCATTT GTCCX3TGTCA AACCCACCGA TGACTTTGCT 60 

CATGAAATGA TCAGATACGG AOATGACAAA AGAAGCATTG ATATTCACTT AAAAAAAGAC 120 

ATTCGGAGA6 GAGTTGTCAA TAACCAACAG ACAQACTGOT CGTTTAAGTT GGATGQAGTT 180 

TTCAG6ATG GCTCGCAOGA CTTGQTTTAT GftfiACAGTTG CAAAfiGATGT GGTTICTCAQ 240 

CCCTCGATO GCTATAATGO CACCATCATO TOrTATOOGC AOACGGGAGC TGGCAAGACA 300 

ACACCATGA TGGGQGCAAC TQAGAATTAC AAGCACCGGG GGATCCTCCC TC3GTQCCCTQ 360 

AGCAGGTTT TTAGGATGAT CQAAQAACGC CCCACACATG CCATCACTGT GC3GTGTTTCC 420 

ACTTGGAAA TCTATAATGA GAGCCTGTTT GATCTCCTGT CCACTCTGCC CTATGTTGGA 480 

CCTCftOrCA CACCAATGAC CATOGTGOAA AACCCTCAAG GAGTCTTCAT TAAGGGCTTQ 540 

CAQTTCACC TCACAAOTCA QGAGQAGGAT GCATTCAGCC TCCTTTTTQA GQgr OAQAC C 600 

ACAGOATTA TAGCCTCCCA CACTATGAAC AAAAACTCTT CCAQATCACA CTGCATTTTC 660 

CCATCTACT TAGAGGCCCA TTCCCGGACC TTATC3U3AGG AAAA6TACAT CACTTCCAAA 720 

TTAACTTGG TGGATCTGGC AGGCTCAGAG AGGCTGGG6A AGTCTGGGTC TGAGGGCCAA 780 

TCCTGAA06 AAGCCACCTA CATCAACAAA TCX3CTCTCAT TCCTGGAGCA GGCCATCATT 840 

CCCTTGGGG ACCAGAA6CG GGACCACATC CCCTTTCGGC AGTGCAAGCT CACCCAC6CT 900 

TQAAGGACT CGTTAGGGGG AAACTGCAAT ATGGTCCTCG TGACAAACAT CTATGGAGAA 960 

CTGCCCAGT TAGAAGAAAC GCTATCTTGA CTGAGATTTG CCAGCAGGAT GAAGCTAGTC 1020 

CCACTGAGC CTGCCATCAA TQAAAAGTAT QATGCTGAGA GAATGGTCAA GAACCTGGAO 1080 

AG6AACTAG CACTACTCAA GCAGQAGCTG GCTATCCATG ACAGCCTGAC CAACCJQCACC 1140 

TTGTGACCT ATGACXTCAT GGATGAAATC CAGATTGCTG AGATCAACTC CCAGGTGOGG 1200 

GOTACCTGQ AGGGGACACT GGAC6AGATC GACATAATCA GCCTTAGACA GATCAAGGAG 1260 

TGTTCAACC AGTTCCXSGGT GGTTCTGAGC CAACAGGAAC AGGAAOTGGA OTCCACTTTO 1320 

GCAGGAAGT ACACCCTCAT TX3ACAGGAAT GACTTTGCAG CCATTTCTGC TAT CCAG AAO 1380 

06GQGCTTG TQGATGTTGA TGGCCACCTA GTGGGTGAGC CTGAAGGACA AAACTTTGGA 1440 

TCGGAGTCG CCCCTTTCTC TACCAAACCT GGGAAGAAAG CCAAGTCCAA GAA GACAT TC 1500 

AAGAGCCAC TCAGGCCC6A CACCCCACCC TCCAAACCAG TGOCCTTTGA GGAGTTTAAG 1560 

ATGAGCAAG GTAGTGAGAT C3UVCCGAATT TTCAAAQAAA ACAAATCCAT CTTGAATGAA 1620 

GGAGGAAAA GGGCCAGCGA GACCACAC3U3 CSVCATCAATG CCATCAAGC3G GGAGATTGAT 1680 

TGACCAAGG AGGCCCTGAA TTTCCAGAAQ TCACTACGGG AGAAGCAAGG CAAGTAC3GAA 1740 

ACAAGGGGC TGATGATCAT CGATGAOQAA GAATTCCTGC TGATCCTCAA GCTCAAAGAC 1800 

TCAAGAAGC AGTACOGCAO OQAOTACCAG 6ACCTGCGTG ACCTCAGGGC TGAGATCCAG 1860 

ATT6CCA6C ACCTAOTOGA TGAOTOTOGC CACOOOCTGC TCATGGAATT TGACATCTGG 1920 

ACAATGAGT CCTTTOTCAT CCCTGAGGAC ATGCAGATGG CACTGAAOCC AGGCGOCAGC 1980 

TCCX3GCXAG GCAT60TCCC TGTGAACAGG ATTGTGTCTC TGGGAGAAGA TGACCAGGAC 2040 

AATTCAGCC AGCTGCAGCA QAGGGTGCTT CCTGAGGGCC CTGATTCCAT CTCCTTCTAC 2100 

ATGGCAAAG TCAAGATAGA GCAGAAGCAT AATTACTTGA AAACCATGA7 G6GCCTCCA6 2160 
AGGCACATA GAAAATA6 



Seq ID NO: 308 Protein sequence:
Protein Accession #: NP_071737

1 11 21 31 41 51

| | | | | |

MGTRiCKVHAP VRVKPTDDPA HEMIRYGDDK RSIDIKMCKD XRRSWNNQQ TDWSPKLDGV 60 

LHDASQDIiVY BTVAKDWSQ ALDGXNGTIM CYGQTGAGKT YTMKGATENY KHRQILPRAL 120 

QQVFRMIEBR PTHAITVRVS YLEIYNESIiF DLLSTLPYVG PSVTPMTIVE NPQGVPIKQL 180 

SVHLTSQEED AFSLLPSGBT MRIIASHTMH RBTSSRSHCIP TIYLEAHSRT LSEBiariTSK 240 

HTLVDLAGSE RLGRSGSEGQ VLKBATYINK SLSFLBQAII AL6DQKRDBI PFRQCKLTHA 300 

LKDSLGGNCN MVIjVTNIYGE AAQLEETLSS LRPASHMKLV TTEPAINEKY DAERMVKNLE 360 

KBLAliLXQBL AIHDSLTNRT PVTYDPMDEI QIAEINSQVR RYLEGTLDEI DIISUIQIKB 420 

VFHQPRWLS QQEQEVBSTL RRKYTIiIDRN DPAAISAIQK AGLVDVDGHL VGEPEGQNFG 480 

LGVAPFSTKP GKKAKSKKTP KBPLRPDTPP SXPVAPEEFK NEQGSEINRI PKENKSILHE 540 

lOKRASETTQ HIMAIKREID VTKEALNFQK SUtEKOGKYE NKGLHIIDEE EPLLILKIiRD 600 

LKKQYRSEYQ DLRDXiRABIQ YOQHLVDQCat HRLLMEFDZH YNESFVIFED MQNALRFGG8 660 

IRPGMVPVNR IVSLQEDDQD KPSQLQORVL PBGFD8ISFY NAKVKIEQXH NYLKTMMOLQ 720 
QAHRK 



Seq ID NO: 309 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

| | | | | |

' lYmvrm ' TTTTTTTTAA TGCCTGCTGT GATGCTCTGT CTACCAGGGT GAATTTCCAA 60 

AAATTTCTGC ATAGCAATTT TAGCCAAAAC TATATATGTT CTGGGGAGGA TAGGC ATAGG 120 

CACATTGAAG ACCAAAGGAA AGAGTGAA6A AGT6TAGTT6 G6TCATTGT6 AATGGATGTT 180 

TAGATTGTCA AGAAAAOTGG GCCAGAGGCC CX^lCCTCACa CTAGGAOGGC AATTGCCTCT 240 

CATTAGTATC TCAG6CACCA TGGGTCTTAT TTGGTGTCAT AAGAAACACC CTC AACAAAG 300 

TAATGAACCC TCAGCCTCCA GCTTCTCTTC TTCGGGATTC TTCTTAGGGC CTCCTTTTTC 360 

CTTTTATOTT TCCAGTACCC TGAATTTCTT ATTCCCATCC CXX*ATTAAAA TCTGCTTCAA 420 

AGAAAAAACA AGAAGGACAC ATTCACTTTA AGATCCAAAT GAATGATAAG AGCTTAAAAC 480 
ATTATACTTA TCAOTATTAT TTGCATTTTT ATAGAAACCA AAAOCATATT TGAACAAC 



Seq ID NO: 310 DNA sequence 

Nucleic Acid Accession #: NM_018622.2 

Coding sequence: 1-1140

1 11 21 31 41 51

| | | | | |

ATGGG6TGGC GAGGCTGGGC GCAGAOAGGC T6GGGCTG06 GCCAGGCGTG GGGTGGGTCG 60 

GTGGGOQGCC GCAGCT6G6A 6QAGCTCACT GCG6TGCTAA CCCGGCCGCA GCTCX:TC36GA 120 

OQCAGGTTTA ACTTCTTTAT TCAACAAAAA T60G6ATTCA GAAAA6CACC CA OGAAO GTT 180 

GAACCTOGAA GATCAGACCC AGGGACAAGT GGTGAAGCAT ACAAGAGAAG TGCTTTGATT 240 

CCTCCTGTGG AAGAAACAGT CTTTTATCCT TCTCCCTATC CTATAAGGAG TCTCATAAAA 300 

CCTTTATTTT TTACTGTTGG GTTTACAGGC TGTGCATTTG GATCAGCTGC TATTTGGCAA 360 
TATGAATCAC TGAAATCCAG GGTCCAGAGT TATTTTGATG GTATAAAAGC TQATTGGTTQ 420 

GATAGCATAA GACCACAAAA AGAAGGAGAC TTCAGAAAGG AGATTAACAA GTOOTOGAAT 480 

AACCTAAGT6 ATGGCCAGCO GACTGTGACA GGTATTATAG CTGCAAATOT CCTTOTATTC 540 

TGTTTAT6GA GA GTAC CTTC TCTGCAGCGG ACAATGATCA 6ATATTTCAC ATCGAATCCA 600 

GCCTCAAAGQ TCCTTTGTTC TCCAATGTTG CTGTCAACAT TCAGTCACTT CTCCTTATTT 660 

CACATG6CA6 CAAATATGTA TGTTTTGTGG AGCTTCTCTT CCA6CATAGT GAACATTCTG 720 

GSTCAAGAGC AGTTCA7G6C AGTOTACCTA TCTQCAGGTO TTATTTCCAA TTTTGTCAGT 780 

TACCT6G6TA AAGTTGCCAC AGGAAGATAT GGACCATCAC TTG6TGCATC TGGTGCCATC 840 

ATQACAGTCC TOGCAGCTGT CTGCACTAAG ATCCCAGAAG GGAGGCTTGC CATTATTTTC 900 

CTTCOGATGT TCACX3TTCAC AGCAGGGAAT GCCCTGAAAG CCATTATCGC CATGGATACA 960 

GCAGGAATGA TCCTGGGATG GAAATTTTTT GATCAT6C6G CACATCTTGG GGGAGCTCTT 1020 

TTTGGAATAT GGTATGTTAC TTACGGTCAT GAACTGATTT GGAAGAACAQ OGAGOOGCTA 1080 
GTGAAAATCT GGCATGAAAT AAG6ACTAAT GGCCCGAAAA AAGGAaOrGG CTCTAAGTAA 



Seq ID NO: 311 Protein sequence:
Protein Accession #: NP_061092.2

1 11 21 31 41 51

| | | | | |

MAWRGWAQRG W6CQQAWGAS VGGRSCEELT AVLTPPQLLQ RRFNFFIQQK CGFRKAPRKV 60 
EPRRSDPGTS GEAYRRSALI PPVEETVPITP SPYPIRSIiZX PIiPFTVGPTG CMGSAAI*fQ 120 
YESLKSRVQS YPDGIKADWL DSIRPQKE6D FRKEINKHNN NLSDGQRTVT GIIAANVLVP 180 
CLMRVPSLQR TMIRYFTSNP ASKVLCSPML LSTPSHFSLP BMAANMYVLW SPSSSIVNH, 240 
GQBQPMAVYL SAGVISNFVS YLGKVATGRY GPSLGASGAI MTVLAAVCTK IPEGRLAIIP 300 
LPMPTFTAGN ALKAUAMDT AGMILGWKFP DHAAHLGGAL PGIWYVTYGH EI.IWKNREPL 360 
VKIWHEIRTN GPKKGGGSK 



Seq ID NO: 312 DNA sequence
Nucleic Acid Accession #: NM_000625
Coding sequence: 195..3656

1 11 21 31 41 51

| | | | | |

CTCTOGGCCA CCTTT6AT6A GGGGACIGGG CAGTTCTAGA CA6TCC0SAA GTTCTCAAGG 60 

CACAGGTCTC TTCCTGGTTT GACTGTCCTT ACCC0GQG6A GGCAGTGCAG CCAGCTGCAA 120 

GCCCCACAGT GAAGAACATC TGAGCTCAAA TCXZAGATAAG TGACATAAGT GACCTGCTTT 180 

GTAAAGCCAT AGAGATGGCX: TGTCCTT6GA AATTTCTGTT CAAGACCAAA TTCCACCAGT 240 

A7GCAAT6AA TGGGOAAAAA QGCATCAACA ACAATGTGGA QAAAGOXXX: TGT6GCACCT 300 

CCAGTCCA6T GACACAGGAT GACCTTCA6T ATCACAACCT CAGCAAGCA6 CAGAATGAGT 360 

CCCCGCAGCC CCTC3GTGGAG ACGGGAAAGA AGTCTCCAGA ATCTCTGGTC AAGCTG6ATG 420 

CAACCCCATT GTCCTCCCCA OGGCATGTGA GGATCAAAAA CTGGGGC3VGC GGGATGACTT 480 

TCCAAGACAC ACTTCACCAT AAflGCCAAAG GGATTTTAAC TTGCAGGTCC AAATCTTGCC 540 

TGGGGTCCAT TATSACTCGC AAAAGTTTGA CCAGAGGACC CAGGGAGAA6 CCTACCCCTC 600 

GRGATOAGCT TCTACCTCAA GCTATC6AAT TTGTCAACCA ATATTACX3GC TCCCTCAAAG 660 

AGGCAAAAAT AGAGGAACAT CTGGCCAGGG TGGAAGCXSGT AACAAAGGAG ATAGAAACAA 720 

CAGTAACCTA CCAACTGACG GGAGATGAGC TCATCTTCGC CACCAAGCAG GCCTGGCGCA 780 

ATGCCCCACG CTGCATTGGG AGGATCCAGT GGTCCAACCT GCAGGTCTTC 6ATGCCCGCA 840 

GCTGTTCCAC TGCCCGGGAA ATGTTTGAAC ACATCXOCAG ACACGTGCOT TACTOCACCA 900 

ACAATGGCAA CATCAGGTCG GCCATCACCX3 TGTTCCCOCA GOOGAGTOAT GGCSUVOCAOO 960 

ACTTCCGGGT GTGGAATGCT CAGCTCATCC GCTATGCTGG CTACCAGATG CCAGATGGC3^ 1020 

GCATCAGAGG GGACCCTGCC AAOGTGGAAr TCflCTCAGCT GTCCArOGAC CMG6CTGGA 1080 

AGCCCAAGTA CGGCCX3CTTC QATGTGGTCC CCCTGGTCCT GCAGGCCAAT GGCOGTCACC 1140 

CTGAGCTCTT OSAAATCCCA CCTGACCTTG TGCTTGAGGT GGCCATGGAA CATCCCAAAT 1200 

ACX3AGTGGTT TCGQGAACTG GAGCTAAAGT GGTACXSCCCT 6CCTGCAGT6 6CCAACATGC 1260 

TGCTTGAGGT GGGCGGCCTG GAGTTCCCAG GGTGCCCCTT CAATGGCTGG TACATGGGCA 1320 

CAGAQATCGG AGTC0G6GAC TTCTGTGATG TCCA6C6CTA CAACATCCTG GAG6AAGTGG 1380 

GCAGGAGAAT GGGCCTGGAA ACGCACAAGC TGGCCTOGCT CTGGAAAGAC CAGGCTGTCG 1440 

TTGAGATCAA CATTGCTGTG CTCCATAGTT TCCAGAAGCA GAATGTGACC ATCAT6GACC 1500 

ACCACTCGGC T6CAGAATCC TTCATGAAGT ACATGCAGAA TGAATACCGG TCCOGTGGGG 1560 

GCTGCCCGGC AGACTG6ATT TGGCTGGTCC CTCCCATGTC TGGGAGCATC ACCCCOGTGT 1620 

TTCACCAGGA GATGCTGAAC TAOGTCCTOT CCCCTTTCTA CTACTATCAG GTAOAaGCCT 1680 

GGAAAACCCA TGTCTGGCAG GACGAGAAGC GGAGACCCAA GAGAA6AGAG ATTCCATTGA 1740 

AAGTCTTGGT CAAAGCTGTG CTCTTTGCCT OTATGCTOAT 6CGCAAGACA ATGGCGTCCC 1800 

GAGTCAGAGT CACCATCCTC TTTGCGACAG AGACAGGAAA ATCAGAGGCG CTGGCCTGGG 1860 

ACCTGGGGGC CTTATTCAGC TGTGCCTTCA ACCCCAAGGT TGTCTGCATG GATAAGTACA 1920 

G6CTGAGCTG CCTGGAOQAG 6AA0QGCTGC TQTTGOTGGT GAOCAGTAOG TTTGGCAATG 1980 

6AGACTGCCC TGGCAATGGA GAGAAACTGA AGAAATC6CT CTTCATGCTG AAAGAGCTCA 2040 

ACAACAAATT CAGGTAOGCT GTGTTTGGCC TOGGCTCCAG CATGTACCCT CGGTTCTGCG 2100 

CCTTTGCTCA TOACATTGAT CAGAAGCTGT CCCACCTGGG GGCCTCTCAG CTCACCCCGA 2160 

TGGGAGAAQG QGATGAGCTC AGTGQGCAGG AGGACGCCTT CCGCAGCTGG GCCGTGCAAA 2220 

CCTTCAAGGC AGCCTGTGAG ACGTTTGAT6 TOOGAGQCAA ACA6CACATT CAGATGCCCA 2280 

AGC TCTA CAC CTCCAATGTG ACCTGGGACC CGCACCACTA CAGGCTCQTG CAGQACTCAC 2340 

AGCCTTTGGA CCTCAGCAAA GCCCTCAGCA GCATGCATGC CAAGAACGTG TTCACCATGA 2400 

GGCTGAAATC TOGGCAGAAT CTACAAAGTC OQACATCCAG COGTGCCACC ATCCTGGTGG 2460 

AACTCTCCTG TGAG6ATGGC CAAGGCCTGA ACTACCTGCC GGGGGAGCAC CTTGGGGTTT 2520 

GCCCAGGCAA CCAGCCGGCC CTGGTCCAAG GCATCCTGGA GCGAGTGGTG GATGGCCCCA 2580 

CACCCCACCA GGCAGTGCGC CTGGAGGCCC TGGATGAGAG TGGCAGCTAC TGGGTCA6TG 2640 

ACAAOAGGCT GCCCCCCTGC TCACTCAGCC AGGCCCTCAC CTACTTCCTG GACATCACCA 2700 

CACCCCCAAC OCAGCTGCIG CTCCAAAAGC TG0CCCA6GT GGCCACAGAA GAGCCTGAGA 2760 

QACAGAGGCT G6AGGCCCTG TGCCAGCCCT CA6AGTACAG CAAGTGGAAG TTCACCAACA 2820 

GCCCCACATT CCTGGAGGT6 CTAGAGGAGT TCCCGTCCCT GCGGGTGTCT GCTGGCTTCC 2880 

TGCTTTCCCA GCTCCCCATT CTGAAGCCCA GGTTCTACTC CATCA6CTCC CCCCGGGATC 2940 

ACACGCCCAC GGAGATCCAC CTGACTGTGG CCGTGGTCAC CTACCACACC CGAQATQQCC 3000 

AGGGTCCCCr GCACCACGGC GTCTGCAGCA CATGQCTCAA CAGCCTGAAG CCCCAAGACC 3060 

CAGTGCCCTG CTTTGTGCX3G AATGCCAOOG GCTTCCACCT OCCCGAGQAT CCCTOCCATC 3120 
CTTGCATCCT CATOGGGCCT GGCACAGGCA TCt3CXX:CCTT CCGCAGTTTC TGGCAGCAAC 3180 

GQCTCCftTGUV CTCCCAGChC AA6GGAGT6C GGGOAGGCCXS CA.TQAGCTT6 6T6TTTGG6T 3240 

GCCGC06CGC AGATQAGGHC CACATCTACC AG6AG6AGAT 6CTGGAGATG GCCCAGAAGG 3300 

GGGTGCTGCA TGOSGTGCAC ACA6CCTATT CCCGCCTGCC TGCCAAGCXX AAGGTCTATG 3360 

TTCAGGACAT CCTGCGGCAG CAGCTGGCCA GCGAGGTGCT CCGTGTGCTC CACAAGGAGC 3420 

CAGGCCACCT CTATGTTTGC GGGGATGTGC GCATGGCCCG GGACGTGGCC CACACCCTGA 3480 

AOCAOCTGOT GGCTGCCAA8 CTGAAATTOA ATGAGGAGCA GGTCGAGGAC TATTTCTTTC 3540 

AGCTCAAGA6 CCAGAAGGQC TATCACGAAiQ ATATCTTTGQ TGCTGTATTT CCTTA03AGG 3600 

CQAAQAAGGA CAGGGTGGOS GTGCAGCCCA GCAGCCTGQA GATGTCAGCG CTCTGAQGOC 3660 

CTACAOGAQG GGTTAAAGCT GCOGGCACAG AACTTAAGGA TGGAGCCAGC TCTGCATTAT 3720 

CT6AGGTCAC AGGGCCTGGG GAGATGGAGG AAAGTGATAT CCCCCAGCCT CAAGTCTTAT 3780 

TTCCTCAAC6 TTGCTOCCCA TCAAOGOCTT TACTTQAGCT CCTAACAA6T AGCACOCTGG 3840 
ATTGATGGGA GCCTC 

Seq ID NO: 313 Protein sequence:
Protein Accession #: NP_000616

1 11 21 31 41 51

| | | | | |

MACPWKFLPK TRFHOYAMNG BKGINNNVEK APCATSS^ QDDLQYHIHiS KQQKESPQPL 60 

VETQKKSPES LVKLDATPLS SPRHVRIKNW GSGMTFQDTL HHKAKGILTC RSKSCMSIM 120 

TPKSLTRGPR DKPTPPDELL PQAIEFVNQY YGSLKEAKIB EHLARVEAVT KEIETTVTYQ 180 

LTGDSLIFAT KQAMIUiAPRC IGRIOWSNLQ VECARSCSTA RE^4FEHICRH VRySTNNGKI 240 

RSAITVFPQR 8DGKHDFRVM NAQLIRYAOY QMFDGSIROD PAHVBFTQLC ZDUSHKPKYG 300 

RFDWPLVLQ ANGRDPSLFB IPPDLVIiBVA MEHPKYEW9R ELELKHYAIiP AVANMLLBVG 360 

GLEPPGCPFN GWYMGTEIGV RDPGDVQRYN ILEEVGRRKQ LETHKIASLW KDQAWEINI 420 

AVIiHSFQKQN VTIMDHHSAA BSFMKYMQNE YRSRGGCPAD WIWLVPPMSG SITPVPHQEM 480 

LNYVXiSPFYY YOVEANKTHV WQDBKRRPKR RBIPLECVLVK AVIiFACMZiMR XTMASRVRVT 540 

ILFATET6KS EALANDLQAL FSCAFNFKW CMDKYRLSCL RRBRTiTJiWT STFCaiQDCPO 600 

NGEKLKRSLF MLKELNNKFR YAVFGIiGSSM YPRFCAFABD IDQKLSHLGA SQLTFMGEGD 660 

ELSGQEDAFR SWAVQTFKAA CBTFDVRGKQ HIQIPKLYTS NVTWDPHHYR LVQDSQPLDL 720 

SKALSSMHAK NVFTMRLKSR QNLQSPTSSR ATILVELSCE DGQGIjNYLPG EHLGVCPGNQ 780 

PALVQGZLER WDGFTPEQA VRLBALOSS6 SYWVSDKRXiP PCSLSQALTY FLDZTTPPTQ 840 

UiLQRLAQVA TBBPERQRLE ALCSQPSBYSK NKFTNSFTFL EVLBBPPSLR VSAGFLLSQL 900 

PZLKPRFYSZ 6SFRDHTFTE IHLTVAWTY HTROGQGPIiH EGVCSTWUTS LKPQDPVPCP 960 

VRNASGPHLP EDPSHPCILI GPGTGIAPPR SFWQQRLHDS QHKGVRGQRM TLVFGCRRPD 1020 

BDHIYQBH4L EMAQKGVLHA VHTAYSRLPG KPKVYVQDIIi RQQLASEVLR VL3SXEPGBLY 1080 

VGGDVRMARD VAHTLKQLVA AKUCLNBEQV E0YFFQZiKSQ KRYHEDZFGA VFPYEAKKDR 1140 
VAVQPSSLEH SAL 

Seq ID NO: 314 DNA sequence
Nucleic Acid Accession #: XM_087254
Coding sequence: 47..2332

1 11 21 31 41 51

| | | | | |

AGAGTACGTG TTTACAGATA AAACTGGTAC ACTQACAQAA AATGAGATGC AGTTTOGGGA 60 

ATGTTCAATT AAT6GCATGA AATACCAAGA AATTAATGGT AGACTTOTAC CCGAA66ACC 120 

AACACCA6AC TCTTCAGAAG OAAACTTATC TTATCTTAOT AGTTTATCCC ATCTTAACAA 180 

CTTATCCCAT CTTACAACCA QTTCCTCTTT CA6AACCAGT CCTGAAAATG AAACTGAACT 240 

AATTAAAGAA CATGATCTCT TCTTTAAAGC AGTCAGTCTC TGTCACACTG TACAGATTAG 300 

CAATGTTCAA ACTGACTGCA CTGGTGATGG TCCCTGGCAA TCC3iACCTGG CACCATCGCA 360 

GTTGGftGTAC TATGCATCTT CACCAQATGA AAA6GCTCTA GTAGAAGCT6 CTGCAAGGAT 420 

TG6TATTGTG TTTATTGGCA ATTCTGAAGA AACTATGGAO GTTAAAACTC TTGGAAAACT 480 

QGAACGGTAC AAACTGCTTC ATATTCTGGA ATTTQATTCA GATCGTAGGA GAATGAGTGT 540 

AATTGTTCAO GCACCTTCAG GTGAGAAGTT ATTATTTGCT AAAGGA6CTQ A6TCATCAAT 600 

TCTCCCTAAA TGTATAGGTG GAGAAATAGA AAAAACCAGA ATTCATGTAG ATGAATTTOC 660 

TTTOAAAGaO CTAAGAACTC TGTGTATAGC ATATAGAAAA TTTACATCAA AAGAGTATGA 720 

GGAAATAGAT AAACGCaXAT TTGAAGCCAG QACTGCCTTG CAGCAGCGGG AAGAGAAATT 780 

GGCAGCTGTT TTCCAGTTCA TAGAGAAAGA CCTGATATTA CTTGGAGCCA CAGCAGTAGA 840 

AGACAGACTA CAAGATAAAG TTCGAGAAAC TATTGAAGCA TTGAGAATGG CiWlATCAA 900 

AGTATGGGTA CTTACTGGGG ATAAACATGA AACAGCTGTT A6TGT6AGTT TATCATGTGG 960 

CCATTTTCAT AOAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAGCGAGTG 1020 

TGCTGAACAA TTQAGGCAGC TTGCCAGAAG AATTACAGAG GATCATGTGA TTCAGCATGG 1080 

GCTGGTAGTG GATGGGACCA GCCTATCTCT T6CACTCAGG GAGCATGAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTOTT CAGCTGTATT ATGCTOTOOT ATGGCTCCAC TGCAG AAAGC 1200 

AAAAQTAATA AGACTAATAA AAATATCACC TGAGAAACCT ATAACATTGG CTGTTGGTOA 1260 

TGGTGCTAAT GACGTAAGCA TGATACAAGA AGCXX:AT0TT GGCATAGQAA TCATGGGTAA 1320 

AGAAGGAAGA CAGGCTGCAA GAAACAGTGA CTATGCAATA GCCAGATTTA AGTTCCTCTC 1380 

CAAATTGCTT TTTGTTCATG GTCATTTTTA TTATATTAGA ATAGCTACCC TTGTACAGTA 1440 

TTTTTTTTAT AA6AATGTGT GCTTTATCAC ACCCCAOTTT TTAIATCAST TCTACTGTTT 1500 

GTTTTCTCAG CAAACATT6T ATGACA6G6T GTACCTQACT TTATACAATA TTTGTTTTAC 1560 

TTCCCTACCT ATTCTGATAT ATAGTCTTTT GQAACAOCAT GTAGACCCTC ATGTGTTACA 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACCGC CTCTTAAGTA TTAAAACATT 1680 

TCTTTATTGG ACGATCCUGG GCTTCAGTCa TOCCTTTATT TTCTTTTTTG GATCCTATTT 1740 

ACTAATAGOG AAAGATACAT CTCT6CTTGG AAATG6CCAG ATGTTTGGAA ACTG6AGATT 1800

TGGCACTTTG GTCTTCACAa TCAT6QTTAT TACAGTCACA GTAAAGATG6 CTCTGGAAAC 1860

TCATTTTTGG ACTTGGATCA ACCATCTOGT TACCTGGGGA TCTATTATAT TTTATTTTGT 1920 

ATTTTCCTTG TTTTATGGAG GGATTCTCTG GCCATTTTTG GGCTCCCAGA ATATGTATTT 1980 

TGTGTTTATT CAGCTCCTGT CAAGTGGTTC TGCTTGGTTT GCCATAATCC TCATGGTTGT 2040 

TACATGTCIA TTTCTTGATA TCATAAAGAA GGTCTTTGAC OGACACCTCC ACCCTACAAG 2100 

TACTGAAAAG GCACAGCTTA CTGAAACAAA TGCAGGTATC AAGT6CTTG6 ACTCCAT6T0 2160 

CTGTTTCCOG GAAGGAGAAG CAGCGTGTGC ATCTGTTGGA AGAATGCTGG AAC3GAGTTAT 2220 

AGGAAGATGT AGTCCAACCC ACATCA6CAG ATCATGGAGT GCATCGQATC CTTTCTATAC 2280 

CAACGACAGG AGCATCTTGA CTCTCTCCAC AATGGACTCA TCTACTTGTT AAAGQGGCAG 2340 
TAGTACTTTG TGGGAGCCAG TTCACCTCCT TTCCTAAAAT TCASTGTGAT CACCCTGTTA 2400

ATGGCCACAC TAOCTCTGAA ATTAATTTCC AAAATCTTTG TAGTAGTTCR TACCCACTCA 2460 

GAGTTATAAT GGCAAACAAA CAGAAAGCAT TACSTACftAGC CCCTCCCAAC ACCCTTAATT 2520 

TGAATCTQAA CATGTTAAAA TTTGAGAATA AAGASACATT TTTCATCTCT TTGTCTGGTT 2580

TQTCXrCTTGT OCTTATGGGA CTCCTAATGG CATTTCAGTC TGTTGCTGAG GCCATTATAT 2640 

TTTAATATAA ATGTAGAAAA AAGAGASAAA TCTTAGTAAA GAGTATTTTT TAGTATTAGC 2700 

TTQATTATTG ACTCTTCTAT TTAAATCTGC TTCTGTAAAT TATGCTGAAA GTTTGCCTTG 2760 

AGAACTCTAT TTTTTTATTA GAGTTATATT TAAAGCTTTT CATGGGAAAA GTTAATGTGA 2820

ATACTGAGGA ATTTTGGTCX: CTCAGTGACC TGTGTTGTTA ATTCATTAAT GCATTCTGAG 2880

TTCACAGAGC AAATTAGGAG AATCATTTCX: AACCATTATT TACTGCI^TA TGGGGAGTAA 2940 

ATTTATACCA ATTCCTCTAA CT6TACTGTA ACACAGCCTG TAAAGTTAGC CATATAAATG 3000 

CAAGGGTATA TCATATATAC AAATCAGGAA TCAGGTCC6T TCACCGAACT TCAAATTGAT 3060 

GTTTACTAAT ATTTTTGTGA CAGAGTATAA AGACCCTATA GTGGGTAAAT TAGATACTAT 3120 

TAGCATATTA TTAAT^AAT GTCTTTATCA TTGGATCTTT T6CAT6CT7T AATCT6GTTA 3180 

ACATATTTAA ATTTGCTTTT TTTCTCTTT& CX^XSAAGGCT C T GTOTATAG TATTTCATGA 3240 

CATCGTTGTA CAGTTTAACT ATATCAATAA AAAGTTT66A OUSTATTTAA ATATTGCAAA 3300 

TATGTTTAAT TATACAAATC AGAATAGTAT GGGTAATTAA ATGAATACAA AAAGAAGAGC 3360 

CTCTTTCTGC AGC^SACTTA GACATGCTCT TCCCTTTCTA TAAGCTA6AT TTTAGAATAA 3420 

AGGGTTTCAQ TTAATAATCT TATTTTCAG6 TTATGTGATC TAACTTATAG CAAACTACCA 3480

CAATACAGTO AGTTCTGCCA GTGTCCCAGT ACAAGGCATA TTTCAGGTGT G6CTGIGGAA 3540

TGTAAAAATG CTCAACTTGT ATCAGGTAAT GTTA6CAATA AATTAAATGC TAAGAATGAT 3600 

TAATOGGGTA CATGTTACTG TAATTAACTC ATTGCACTTC AAAACCTAAC TTCCATCCTG 3660 

AATTTATCAA GTAGTTCAGT ATTGTCATTT GTnTTGTTT TATTGAAAAG TAATGTTGTC 3720 

TTAA6ATTTA GAAGTGATTA TTAGCTTGAG AACTATTACC CAGCTCTAAG CAAATAATGA 3780

TTQTATACAT ATTAA6ATAA TQGTTAAATQ OQQTTTTAOC AAGTTTTCCC TTGAAAAT3T. 3840 

AATTCCTTTA TG6AGATTTA TTGTGCAGCC CTAAGCTTCC TTCCCATTTC ATGAATATAA 3900 

GGCTTCTAGA ATTGGACTGG CAGGGGAAAG AATGGTA6AG ACAGAAATTA AGACTTTATC 3960 

CTTGTTTGCT TGTAAACTAT TATTTTCTTG CTAATGTAAC ATTTGTCTGT TCCAGTGATG 4020 

TAAGGATATT AAGTTATTAA 6CTAAATATT AATTTTCAAA AATAGTCCTT CTTTAACTTA 4080 

GATATTTCAT AGCTOOATTT AGOAAaATCT OTTATTCTGG AAGTACTAAA AA6AATAATA 4140 

CAACX3TACAA TOTCTQCATT CACTAATTCA T8TTCCAGAA 6AGGAAATAA TGAAQATATA 4200 

CTCAGTAGAG TACTAGGTGQ 6AGGATATGG AAATTT6CTC ATAAAATCTC TTATAAAACQ 4260 

TGCATATAAC AAAATQACAC CCAGTAGGCC TGCATTACAT TTACATGACC GTQTTTATTT 4320 

GCCATCAAAT AAACTGAGTA CT6ACACCAG ACAAAGACTC CAAAGTCATA AAATAGCCTA 4380 

TGACCAACTG CAGCAAGACA GGAGGTCAGC T06CCTATAA TGGT6CTTAA AGTGTGATTG 4440 

ATGTAATTTT CTGTACTCAC CATTTGAAGT TAGTTAAGGA GAACTTTATT TTTTTAAAAA 4500 

AAGTAAATGG CAACCACTAG TGTGCTCATC CTGAACTOTT ACTCCAAATC CACTCCGTTT 4560 

TTAAAGCAAA ATTATCTTGT 6ATTTTAAGA AAAGAGTTTT CTATTTATTT AAGAAAGTAA 4620 

CAATGCAGTC TGCAA6CTTT CAGTAGTTTT CTAGTGCTAT ATTCATCCTG TAAAACTCTT 4680 

ACTACGTAAC CA6TAATCAC AAQQAAAGTG TCCCCTTTGC ATATTTCTTT AAAATTCTTT 4740 

CTTTGGAAAG TATGATGTTG ATAATTAACT TACCCTTATC TGCCAAAACC AGAGCAAAAT 4800 

GCTAAATACG TTATTGCTAA TCAGTGGTCT CAAATCGATT TGCCTCCCTT TGCCTCGTCT 4860 

GAGG6CTGTA AGCCTGAAGA TAGTGGCAAG CACCAAGTCA GTTTCCAAAA TTGCCCCTCA 4920 

GCTGCTTTAA GTGACTCAGC ACCCTGCCTC AGCTTCAGCA GGCGTAG6CT CACCCTGGGC 4980 

GGAGCAAAGT ATGGGCCAGG GAGAACTACA 6CTACGAAGA CCTGCTGTCG AGTTGAGAAA 5040 

AGGGGAGAAT TTATGGTCTG AATTTTCXAA CTGTCCTCTT TCTTGGGTCT AAAGCTCATA 5100 

ATACACAAAG GCTTCCAGAC CTGAGCCACA CCCAGGCCCT ATCCTGAACA GQAGACTAAA 5160 

CAQAGQAAA TCAACCCTAG GAAATACTIG CATTCTGCCC TACGOTTAGT ACQUSGACTO 5220 

AGGTCATTTC TACTGGAAAA 6ATTGTGAGA TTGAACTTAT CTGATOGCTT GAGACTCGTA 5280 

ATAGGCAGGA GTCAAGGCCA CTAGAAAATT GACAGTTAAQ AGCCAAAAGT TTTTAAAATA 5340 

TGCTACTCTG AAAAATCTCG TGAAGGCTGT AGGAAAAGGG AQAATCTTCC ATGTTGGT6T 5400 

TTTTCCTGTA AAGA TCAGTT T GGGG TATGA TATAA GCAGG TATTAATAAA AATAACACAC 5460 

CAAAGAGTTA 06TAAAACAT QTTTTATTAA TTTTGGTCCC CAC6TACAGA CATTTTATTT 5520 

CTATTTTGAA ATGAGTTATC TATTTTCATA AAAGTAAAAC ACTATTAAAG T6CTGTTTTA 5580 

TGTGAAATAA CTTGAAT6TT GTTOCTATAA AAAATAGATC ATAACTCAT6 ATATGTTTGT 5640 

AATCATGGTA ATTTAGATTT TTAT6AGGAA TGAGTATCTG GAAATATTGT AGCAATACTT 5700 

GGTTTAAAAT TTTGGACCTG AGACACTGTG GCTGTCTAAT GTAAT CCTTT AAAAATTCTC 5760 

TGC3VTTGTCA OTAAATQTAG TATATTATT6 TACAGCTACT CATAATTTTT TAAAGTTTAT 5820 
GAAGTTATAT TTATCAAATA AAAACTTTCC TATAT 

Seq ID NO: 315 Protein sequence:
Protein Accession #: XP_087254



1 11 21 31 41 51 

| | | | | |

KQPRBCSING MKYQEINGRL VPEGPTFDSS EGNLSYLSSL SHLNNLSHLT TSSSFRTSPE 60 

NETELIKEHD LFFKAVSLCH TVQISNVQTD CTGDGPWQSN LAPSQLEYYA SSPDEKALVE 120 

AAARIGIVFI. GNSEETHEVK TLGKLERYKL LHILSFDSDR RRKSVIVQAP SGEKLLFAKG 180 

AESSILPKCI GGEIEKTRIH VDBFALKGIiR TLCIAYRKFT SKSYEEXDKR IFEARTALQQ 240 

REBKLAAVFQ FIEKDLILLG ATAVEDRIiQD KVRBTZBALR MAOZKVWVLT ODKBETAVSV 300 

SLSCGBFHRT MNZLELZNQK SDSBCAEQLR QLARRITBDH VIQHGLWDG TStiSLALRSH 360 

EKLFKEVCRtf CSAVLCCRMA PLQKAKVIRL IKISPEKPIT LAV6DGAMDV SMIQEAHVGI 420 

GIMGKEGRQA ARNSDYAIAR FKFIjSKI.LPV HGHFYYIRIA TLVQYFFYKN VCFITPQFLY 480 

QFYCLFSQQT LYDSVYIiTLY NICFTSLPIL lYSLLEQHVD PHVLQNKPTL YRDISKNRLD 540 

SIRTFLYirri LGFSHAFIFF FGSYLLZGKD TSLLGNGQHF GNWTFGTLVF TVMVITVTVK 600 

MALBTHFIVTW ZNKLVTHGSI ZFYFVFSLFY GGZLWPFU3S QNMYFVFZQL L8S6SAHFAI 660 

ZLHWTCIiFIi DIIKKVFDRH LHFTSTEKAQ LTBTNAGZKC LD8HCCFPBG BAACASVGRK 720 
LERVZGRCSP TBZSRSWSAS DPFYTNDRSZ LTLSTMDSST C 

Seq ID NO: 316 DNA sequence
Nucleic Acid Accession #: NM_004473
Coding sequence: 661..1791

1 11 21 31 41 51 

| | | | | |

CTG6CCAGC3Q GTC060GGG6 CTGGAGACCC AG6CCGT06A GAGGACCAGC CTCAGGTCGC 60 
CCaSCCTGGO CC0GC3GCCCC GACCTCGCTG CCCCCGCCTC GCCTCTCTGC COSTGGCGCr 120

TACOGCaU:C TKXSOCTOGG GGGCAGGGCft TGGGGGGCXIC COGGCAGATC GCCCAQCQCC 180

AGTACTAACT GCCCTCGCTC TGGCCTTOGA GCCXS3AA6CC TCTTCTGCGC OCACAACCTA 240 

GGCAGTAATC CTAAACXAGC GGGCACCACA GACCAGCTGC AGCCACCCCA ACCCAOGGAT 300 

CACTTCCGGA CCCCTCGACC QCCCGGCACC AQCGCGCAAG GGACCCTTCA GCCGGAGACX: 360 

AOAQTCCAGT CC0GGT0G06 AGGCCACOGC CGCTGCCCGC CTaSAGAAGC ACRACGCGGG 420 

CEGAGCG6TC G6CTAG0GGG TCACTOCOOA GCCTCTOTCT GCAGOQCGCX: AGCCCCAGAC 480 

CACGGAOGCT GAGCCTCCAG CGC6C3GCCAO CCTO G SC C JOC TGGGCTCTCC GGGCCAGCCC 540 

GCGACGATCC CCTGAGCTCT CCGCAQAAGO 0CCQAGCX3TC CGTTCCGGGG AOGCCAGGCC 600 

CGCCCCCGCC CCCCGACAGC CGCX3GGGATC CAOAGCCCGG GGGTGCX3GGA CX5CC0GCGCC 660 

ATGACTGCGG AGAGCX3GGCC 6C0GCC0C0G CAGCOSGASG TGCTGGCTAC CGTGAAGGAA 720 

GAGOQCXSGOG ASAGGGCAaC A6GGGCC606 GTCCCAGGG6 AGGCCAOGQG C0GCGGGGCX3 780 

QGCGGGCGGC GCOGCAAGCG CCCCCT6CA6 CX3CJ0GGAAGC OGCCCTACAG CTACATCGCQ 840 

CTCATOGCCA TGGCCATCGC GCACGCGCCC GAGC3GC0GCC TCACGCTGGQ CX3GCATCTAC 900 

AAQTTCATCA CCGAGCGCTT CCCXTTTCTAC CGCGACAACC CCAAAAAGTG GCAGAACA6C 960 

ATCOGCCACA ACCTCACACT CAACGACTGC TTCCTCAAGA TCCC6CGCGA GGCOGGCCGC 1020 

CCGG6TAA6G GCAACTACTG GGCGCTOGAC CCCAAC6CGG AGGACATGTT CGAGAG06GC 1080 

AGCTTCCTGC GCCGCOGCAA OCOCTTCAAG CGCTCGGACC TCTCCACCTA CCCGGCTTAC 1140 

ATGCACX3A0G CX3GCGGCTGC CGCAGCOGCC GCTGCOQCAQ COSCOGCOGC CGCOSCCXSCC 1200 

GCC3GCCATCT TCCCAGGCGC GGTGCCOGCC GCGCGCCCCC CCTACCOGGG OGCO GTCTAT 1260 

GCAGGCTACX3 CGCOGCCGTC GCTGGCOGCG COSCCTCCAG TCTACTACCC CGCG6CGTGG 1320 

CCC3GGC0CTT GCCGOGTCTT CGGCCTGGTT CCTGAGOGGC CGCTCAGCCC AGAGCTGGGG 1380 

GCGGCAOOGT OGQGGCCXXSG OSGCrCTTGC GCCTTTGCCT CCGCCGGCGC CCCCGCTACC 1440 

ACCACCGGCT ACCAGCCCSGC AGGCTGCACC GGGGCC0Q6C CGGCCAACCC CTCTGC CTAT 1500

GCGiSCTGCCT ACGOGGGCCX: GGACX3GC36CG TACCC3GCAGG GCGCOGGCafl TGOGATCTTT 1560 

6C0GCT6CTG GCCGCCTGGC GGGACCOGCT TCGCCCCCAG CGGGC3GGCA6 CAGTaGGGGC 1620 

GTGGAOACCA CX5GTGGACTT CTACGGGCGC ACGTCGCCCG GCCAGTTaJQ AGCXSCTGGGA 1680 

GCCTQCTACA ACCCtGGCGG GCAGCTOGGA GGGGCCAGTG CAGGC3GCCTA CCATGCTCGC 1740

CaTGCTGCOG CTTATCCCGG TGGGATAGAT CX5GTTCX3TGT CCGGCATGIG AQCCA6CGTA 1800 

GGGACGAAAA CTCATAGACA CATCGGCTGT TCACACXJTTC CCCXSCAACCT GAGAAOGAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACCC ACGTGGAAAA GACX3SA6CAG GCCACAGAGG 1920 

CTOQGTCTCC CCGCX3CACAG CGTAGGCACC CTGTGTACTC TGTAAACGGG AGGAGGTGGG 1980 

GCX3AGGCA6C CAfiAGCCCTT GGACTGGCAC AGGGACCCTC GATGGAGOSA AGCCCTCAAA 2040 

OGGGATCCTT TCTGGCATTC TATOGGGGAG GGTCCTTGGC GG7AACCAGA GGGCAGCGTA 2100 

GTGTCAACAC CA6A6ACCAG GATCCAAATT GTGGGGAATC AGTTTCAGCC T TCCAT GTGC 2160 

TGCCGGAACT CGGGCCTTTT TAOGCJGQTTC GTCCTCTAGT GCCTTTAACT 606TTACTAC 2220 

AATAAAAGGC TGCX3GCA0C0 CCTTTCTTCT TAAAGTGAGG AGGACAAATT TGCAAAAGAA 2280 

ATAGGCTTTT Cr rCi Tm ' i ' AAATTGQAGA AATCTCTGCT CTGGTTGACC TGGQCTGGTT 2340 

TTCCCTGTCT CXGAGAACTT GA6ACCTAGC TCCGAGTTGA ACTGTGCGTC AGCACTGCAG 2400 

TCCCATCACC TGAAOCTTCA GTCTCCCCCA TCIGTTACAC TAGAGGGCTG CAOGACTCTA 2460 

TCCACOSCCC CCGGQTTATC ATTCAGGGCC CCATCATCTT 6GAT0CTGCC CT0a3TATTT 2520 

GGCAGCAATG GTGGGCCACC CAGGGCCTCT GAGTAGCCAC CCAAAGCCTA GCOGCTGTTC 2580 

TAGGGAACGG AAAAGAGTTC ATGGCCAAGC QTCTAACCTA AAGTCCCAGG ATTGGCTCCA 2640

GGCAGCAATT ATATCATAAC TTATTGAACT TTTGAGCAGG ACGTGCTGGT AATTTCATGG 2700 

CIGTTACXGC CCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCOT 2760 

TCATTTGTCC ACTGTTTCTT GTCATCAOGC AQCCCTGGAC CCAAAOGGTG AACTAAAGTT 2820 

TAAGGAGATG AGAGGATTCA AGGAGCCCOT TGGTGAOGCC TTTCAGTAGC TGQQQAGQGC 2880 

TCTTCCATCC CCAGCACCCC CTGCTACACC TCAGCAGCCT CCCCCAT6CA AAAAGGAAAG 2940 

AGAAAAATTA AGTTAGGGCA GTCAGTAAAG TGAGCTTTAG AAAGAAACTG GAATTTTAAC 3000 

TTCATTTTGT ATCTTGCTTA AGTAGC3MSK: TCACTAAAAT TAGAOAAAGT CCAATAACTC 3060 

TCCCCCTTTC CCTTGAGAAA TCTTTAA6TT TOGATTCTGO AGCAAAAACT TTCAOCATTA 3120 

AATATTTOIG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTGTTCAG ATGGACTGTT 3180 

TTAATAAAAA TCTTTGAGCA AGTGAGTTAT GGCAAGAGAA ACTCAGCCTC TTTCTGTATA 3240 

AACTTAACAG GGAAGGGCTG GGGTGTQAAA AAGAAGATTG TATGAAAACC ATTGGTAATT 3300 

TTTATTTTTT ATTTTTCGGA CTGCACTATC CTOTTCAOQA AGACATOTOA ACTTGGTTCA 3360 

GTGCAAATG6 G6ATTTGTAT AAACCAGTGC TCICCATTAG AAATATGGTG CAAGCCACAT 3420 

A3GTAATTTT AAATATTCIA GTAGGCAO^T TAATAAAGTN AAAAOAAACA AAAAAAAAAA 3480 
AA 

Seq ID NO: 317 Protein sequence:
Protein Accession #: NP_004464

1 11 21 31 41 51 

| | | | | |

FKHLTBKROI DTRANSCRIP TIOMPACTOR TTFMTAESGP PPPQPEVLAT VKBERGETAA 60 

GAGVPGEATG RGAGGRRRKR PlJQRGKPpyS YIALIAMAIA HAPERRLTLG GIYKFITERF 120 

PFYRDNPKKW QKSIRHNLTb NDCFLKIPRE AGRPGKam? ALDPNABDMP ESGSFLRRRK 180 

RFKRSDLSTY PAYMHDAAAA AAAAAAAAAA AAAAAIPPGA VPAARPPYPG AVYAGYAPPS 240 

LAAPPPVYYP AASPGPCRVF GLVPERPLSP EL6PAPSGPG GSCAFA5AGA PATTTGYQPA 300 

GCXGARPANP SAYAAAYAGP DGAYPQ6ASS AIFAAAGRLA GPASPFAGGS SGGVETTVDF 360 
YGRTSPGQFG AL6ACYNPGG QLGGASAGAY BARHAAAYPG 6ZDRFVSAN 

Seq ID NO: 318 DNA sequence
Nucleic Acid Accession #: NM_005688
Coding sequence: 126..4439

1 11 21 31 41 51 

| | | | | |

CCGGGCAGGT GGCTCATGCT COGGAQCGTO GTTGAGOGGC TGGOGCGGTT GTCCTGGAGC 60 

AGGGGOGCAG GAATTCTGAT GTQAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 120 

AGAAGATGAA GGATAT06AC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

6TGTGAGGGA GAOAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAA6TTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCOSAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAQC 420 

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 
CCC3GTGTaGC CCACAAGAAG GGGOAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540
ACGAflTCTTC TGACX3TGAAC TOCAGAAQAC TAGAGAGACT GTCGCAAGAA QAGCTCAATG 600
AAGTTGGGCC AGACOCTGCT TCCCTGCX3AA GGGTT0TGT6 GATCTTCTGC OGCACCAGGC 660 
TCATCCTGTC CATOSTGTGC CTGATGATCA CGCAGCTOGC TGGCTTCAOT 6SACCAGCCT 720 
TCATGGTGAA ACACCTCTTG QAGTATACCC AGGCAACAGA 6TCTAACCTG CaCTACAGCT 780 
TGTTGTTAGT GCTGGGCCTC CTCCTGAOGQ AAATOGTGCG GTCTTGGTCG CTTGCACTGA 840 
CTTGQGCATT GAATTACCGA ACCGGTGTCC GCTTC06QGG GGCCATCCTA ACCaTGGCAT 900 
TTAAQAAGAT CCTTAAOTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 
TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCRGCAGC OSTTGGCAGC OTGCTGGCIG 1020 
GAGGACCCGT TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACC3VACAG 1080 
GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATCATGTTT GCATCACGGC 1140 
TCACftGCATA TTTCRQ(»GA AAATGC5GT6G CCGCX3WGGA TOAACGTGTC CAGAAGATGA 1200 
ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCSiAA GCATTTTCTC 1260 
AGA6T6TTCA AAAAATCCGC GAQGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320 
AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGGTGAT TGCCAGCGTG GTGACCTTCT 1380 
CTGTTCATAT GACCC TGGG C TTCGATCTGA CA6CAGCACA GGCTTTCACA 6TGGTGACA0 1440 
TCnOATrC CaTGACTTTT GCTTTOftAAG TAACACCGTT TTCAtSTAAAG TCOCTCTCAO 1500 
AAGCCTCAGT GGCTOTTOAC AGATTTAAOA GTTTOTTTCT AATGQAAGAO GTTCACATGA 1560 
TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 
GGGACTOCTC CCACTCCAGT ATOCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAA6 1680 
ACAAGAGGGC TTCCAQGGGC AAQAAAGAGA AGGTQAGGCA GCTGCAGCGC ACTGAGCATC 1740 
AGGOSGTGCT GGCAGAGCAG AAA6GCCACC TOCTCCTGGA CAGTGAOGAG CGGCXCAGTC 1800 
CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGOOCACCT GC3GCTTACA0 AGGACACTGC 1860 
ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920 
GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAG6GCAGCA 1980 
TTGCAATCAG TGQAACCTTC GCTTAT6TGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 
TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTOA 2100 
ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGOGACCTG AOOGAGATTO 2160 
GAGAGCX5AQG A6CCAACCTG AGCGGTG66C AGOGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 
TGTATAGTGA CftGGAGCATC TACATOCTGG AOGACCCCCT CAGTGCCTTA GATGCCCATQ 2280 
TGGGC3ACCA CATCTTCAAT AOTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 
TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATQAAAGAGG 2400 
GCTCTATTAC GGAAA GAGGC ACCXaTGAGG AACTGATOAA TTTAAATGGT GACTAT6CTA 2460 
CCATTTTTAA TAACCTGTTG CTGGGAGAGA CAC0GCCA6T TGABATCAAT TCAAAAAAGG 2520 
AAAOCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACASOA TCAGTAAAGA 2580 
AGQAAAAAGC AQTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGQAAGAG AAAGGGCAGG 2640 
GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTCGCAT 2700 
TCCTGGTTAr TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 
GGTTGAGTTA CTGGATCAAG CAAGGAAGOS GGAACACCAC TGTGACTCGA GGGAACGAGA 2820 
CCTCGGTGA6 TGACAQCATO AAGGACAATC CTCATATGCA OTACTATGCC AOCATCTAOG 2880 
CCCTCTCCAT GGCAGTCAT6 CTGATCCTQA AAGCCATT06 A6GAGTT6TC TTTGTCAAGG 2940 
GCACGCTGGG AGCTT CCTCC CGGCTQCAT6 ACX3AGCTTTT OCGAAGGATC CTTCQAAGCC 3000 
CTATGAAGTT TTTOXSACACSO ACXXCCACAQ GGAGGATTCT CAACaGGTTT TCC3U4AGACR 3060
TGGATGAAGT TGACX3TGC3GG CTGCCGTTCC AGGCXX3AGAT GTTCATCCAQ AACGTTATCC 3120 
TGGTGTTCTT CTGTGTGGGA ATGATOGCftO GAOTCTTCCX! GTQOTTCCTT GTGGCAOTGO 3180 
6QCCCCTTGT CATCCTCTTT TCaGTCCTGC ACATTGTCTC CAOGOTCCTG ATTCGGGAGC 3240 
TGAAG C3GTC T GGACAATATC ACGCaGTCAC CTTTCCTCTC CCACATCAC3G TCC3VGCATAC 3300 
AOGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTCCAC AGATACCaCG 3360 
AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTQCGATG CGGTCGCTGG 3420 
CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CAOGGGGCTG ATQATCGTTC 3480 
TTATGCACGG GCAGATTCCC CCAGCCTAT6 GGGGTCTCGC CATCrCTTAT GCTCTCCAGT 3540 
TAAOSGGGCT GTTCCAGTTT ACXX3TCAGAC TGGCATCTGA GACAQAAGCT CGATTCACCT 3600 
CJOGTOGAQAG GATCAATCAC TACATTMM3A CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 
AGAAC3UM3GC TCCCTCCC3CT GACTGGCCOC AGGAGGGAGA GGTGACCTTT GAGAACGCAO 3720 
AGATGAGGTA CCQAGAAAAC CTCCCTCTTG TCCTAAAGAA AQTATCCTTC ACQATCAAAC 3780 
CTAAAGAGAA GATTGGCATT GTGGGQCOGA CAGGATCAGQ GAAGTCCTCG CTQGQQATGG 3840 
CCCTCTTCCG T CTGG TGGAG TTATCTGGAG GCTGCATCAA GATTQATGGA GTGAGAATCA 3900 
GTGATATT6G C CTTGO OGAC CTOOGAAGCa AACTCTCTAT CATTOCTCAA 6AGCXX3GTGC 3960 
TGTTCftGTGG CACTGTCAGA TCAAATTTGO ACXCCTTCAA CCAGTACACT GAAGAOCAGA 4020 
TTTGGGAT6C CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 
TTGAATCTGA AGTGATGGAG AATGGG6ATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 
GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 
CCATGGACAC AGAGACAOAC TTATTGATTC AAQAGACCAT COQAGAAGCA TTTGCA6ACT 4260 
GTACCATGCT 6ACCATTGCC CATCGCCT6C ACAC3GGTTCT AQGCTCC3GAT AGGATTATOO 4320 
TGCT^CCCA GGQACAGGTG GTQQAGTTTO ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 
OTTOCOQATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGT06CTGTC AAGGGCTGAC 4440 
TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAQCATTGC CATTC3CCTGC CT6GGGCGGG 4500 
CCCCTCATCG CGTCCTCXnrA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TOGC31CAGCA 4560 
GTTCCX3GATT GGCTTGTGTG TTTCACTTTT AGQGAOAGTC ATATTTTGAT TATTGTATTT 4620 
ATTC CATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 
GGGAACOGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TOTAGCTATA 4740 
TCTATATATA ATTCTOTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GrTTATTTTA 4800 
TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACftGT 4860 
TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 
CTCTAGCTGG TGGTTTCaCG GTGCCAGGTT TTCTGGGTGT CCAAAGGAAG AOGTCTX3GCA 4980 
ATAGTGGGCC CTCCGACAGC OOCCTCTGCC GCXnx:CCCAC AGCCGCTCCA G6GGTGGCTQ 5040 
OAQACGGGTG GGOGGCTGGA QACCATQCAG AQCGCOBTGA GTTCTCAGGG CTCCTGCCTT 5100 
CTGTCCTGGT 6TCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160 
TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 
TTTCX:TGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCA6TCTAT CCACAGAGAG 5280 
TCCCACTGCC TCAGGXTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAOC TCCAAGACCT 5340 
GTTQQTTCCA AGCCCTGGAO CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 
ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460 
CTCACOGCAG TCGTCGCACA C5TCTCTCTCT CTCTCTCCCC TCAAAGTCTC CAACTTTAAG 5520 
CAGCTCTTGC TAATCAGTGT CTCACACTGG OGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 
ACCTCAGGTT GCTGGTTGCT GTGTGQTTTG GTGT6TTCCC GCAAACCCCC ' m' tf rGCiVl' 5640 
GQflGCTGGTA GCTCAQOTGO GOSTOGTCAC TGCTQTCATC AGTTGAATGG TCAGC3STT0C 5700 
ATGTCGTGAC CAA.CTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760

CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 319 Protein sequence:
Protein Accession #: NP_005679

1 11 21 31 41 51

| | | | | |

MKDIDIGKBY IIPSPGYRSV RBRTSTSC5TH RDREDSKFRR TRPIiEOQDAI. ETAARAEGLS 60 

LDASHHSQIiR ILDEBHPKGK YHH6LSALKP IRTTSKHQHP VDNAOLPSCM TPSWLSSLAR 120 

VAHKKGELSM EDVWSLSRHE SSDVNCStRLB RLHQBELNSV 6PDAASLRRV VWIFCRTRIiI 180 

LSIVCLMITQ LAGPSGPAPM VKHLLEYTQA TBSNLQYSLL LVLOLLLTEI VR6WSLA1.TW 240 

ALNYRTGVRX. RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMPEA AAVGSLIiAJSG 300 

PWAILGMIY NVIILGPTGF LGSAVPILFY PAMMPASRLT AYPRRKCVAA TDERVQKMNE 360 

VLTYlKPIiW VAWVKAPSQS VQKIREBERR lliEKAGYPQG ITVGVAPIW VIASWTPSV 420 

HMTLQPDLTA AQAPTWTVF HSMTPAIiKVT PPSVKSLSBA SVAVDRPKSIi PLMEEVHMIK 480 

NKPASPHIKI EMKNATLAWD SSHSSIQKSP KLTPKMKKDK RASRGKKBKV RQLQRTEHQA 540 

VliAEQKGHLL LDSDERPSPE EBEGKHIHLG HLRLQRTLHS IDLBIQBGKli VOICQSVGSa 600 

KT8LI8AIL6 QMTLLBOSIA ISGTFAYVAQ QAWIUIATLR DNILPGKEYD EBRYNSVWS 660 

CCLRPDLAIL PSSDIiTBZGB RQANLSGGQR QRISIiARALY SDRSIYIItDD PLSALDAHVG 720 

NHIPHSAIRK HLKSKTVliPV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNIiNGDYATI 780 

PNNLLLGBTP PVEINSKKBT SGSQKKSQDK GPKTGSVKKE KAVKPEBGQL VQLEBKGQGS 840 

VPW9VYGVYI QAAGGPLAFL VIMALFMLNV GSTAPSTMWL SYKIKQ6SGN TTVTRGNBTS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLIIiKA IRGWFVKGT LRASSRLHDE LFRRILRSPM 960 

KFPDTTPTGR ILNRPSKDMD BVDVRI.PFQA EMFIQNVILV FPCVOIIAGV PPWFLVAVGP 1020 

LVlIiPSVLHI VSHVLIRELK RLDNITQSPF LSHITSSIQG lATXHAYNKG QEFLHRYQBL 1080 

LDDNORPFFL PTCAMRWLAV RLDLISIALl TTTGLMIVLM HGQIPPAYAO IiAISYAVQLT 1140 

GLFQFTVRIiA SBTBARFTSV BRINHYIKTL SLEAPARIKN KAPSPDWPQE (ffiWTPENABM 1200 

RVRENLPIiVL KKVSPTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDOVRISD 1260 

IGLADLRSKL SIIPQEPVLP SGTVRSNLDP FNQYTEDQIW DAIiBRTHMKE CIAQLPLKLE 1320 

SEVMBM6DNF SVGERQLLCI ARALLRHCKI LILDBATAAM DTETDLIiIQE TIREAPADCT 1380 
MLTIAHRLHT VL6SDRIMVL AQGQWEFDT P5VLLSNDSS RPYAMFAAAE NKVAVKG 

Seq ID NO: 320 DNA sequence 

Nucleic Acid Accession #: AK022089.1

Coding sequence: 181-1488

1 11 21 31 41 51 

AGCAGTTGCA CAACTTCCAG CAACTTTCTC AGCXX3GCTAC TAATQAGCTG AAAGCCAGGA 60 

ACATCCGAGG AGAAGAGAAA GCTTCCAGCC CTCCTCCCTT CAC CCTG GAA ATCCAGACAC 120 

CCCCACCOCC ACOCTCftGAT CACTTTAAQA TAATTTCTTT ATTOSTXTGC OCGACAGACC 180 

ATCGCTCCCT TOGAAGAAA CTTQCTAAAO ACTC3SOC3«A AAAACAGATC TCCAACTAAA 240 

GACATGGATT CAGAA6AGAA GGAAATTOTG GTTTGGGTTT GCCAAGAAGA GAAGCTTGTC 300 

TGTGQGCTGA CTAAACGCAC CACCTCTQCT QATGTCATCC AGGCTTTGCT TGAGGAACAT 360 

GAGGCTAOGT TTGGAGAGAA AOGATTTCTT CTGGGGAAGC CCAGTGATTA CTG CATCATA 420 

QAQAAOrGGA GAG6CTCa3A AAGOaTTCTT CCTCCACTAA CTAGAATCCT GAAQCTmG 480 

AAAG0GTQ6G CAGATGAGCA GCGCAATAT6 CAATTTGTTT TGGTTAAAGC AGATCCTTTT 540 

CTTCCAQTTC CTTTGTGGCG GACAGCTGAA GCCAAATTAG TGCAAAACAC AGAAAAATTG 600 

TGGGAGCTCA GCCCAGCAAA CTACATGAAG ACTTTACCAC CAGATAAACA AAAAAGAATA 660 

GTCAGGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCAGG ACACAQTTTC TCATGATCGA 720 

OATAATATGG A8ACATTAGT TCATCTGATC ATTTCOCAOO AOCATACTAT TCATCAGCAA 780 

GTCAAGAGAA TGAAAGAGCT GGATCT6GAA ATTQAAAAQT GTGAAGCTAA GTTCCATCTT 840 

GATCGAGTAG AAAATGATGG AGAAAACTAT GTTCAGGATG CATATTTAAT 6CCCAGTTTC 900 

AGTGAAGTTG AGCAAAATCT AGACTTQCAG TATGAGGAAA ACCAGACTCT GGAGGACCT6 960 

AGCGAAAGTG ATGGAATTGA ACAGCTGQAA GAAOQACTGA AATATTACCG AATACTCATT 1020 

GATAAGCTCT CTGCTCAAAT AGAAAAA6A6 GTAAAAAGTO TTTGCATTGA TATA AATGAA 1080 

GATG06GAA0 GGGAAGCTGC AAGTGAACT6 GAAAGCTCTA ATTTAGA6AG TGTTAAGT6T 1140 

GATTTGGAGA AAAGCATpAA AGCTGGTTT6 AAAATTCACT CTCATTTGAG TQQCATCCAG 1200 

AAA6AGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TGAACTCCTO 1260 

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAA6ATG GGTQCCAGTT AAAG6AAAAC 1320 

AOAGCGAAGO AATCTGAGGT TCCCAGTAGC AATGGGGAGA TTCCTCCCTT TACTCAAAGA 1380 

OTATTTAGCA ATTAC3VCAAA TGACACAGAC TCGGACACTQ GTATCAGTTC TAACCACAGT 1440 

CAGGACTCOG AAACAACAGT AGGAGATGTG GTOCTGTTGT CAACATAGTT CCA ATGGC TC 1500 

CTTTCTGACC TGCTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CTCATTTTAA 1560 

ATATAACACT CAAAAAAATG TAAATCATAT TGTAGTATTC AATAGTTAAT AAAAACTG8A 1620 
GAAATGTGTT OTTTCTO 

Seq ID NO: 321 Protein sequence: 
Protein Accession #: NP_005438.1 

1 11 21 31 41 51 

ilAPFGRNLLK TRHKNR9PTK DMDSEEIGEIV VWVOQEEKLV CGLTKRTTSA DVIQALLEEH 60 

EATFGEKRPL LGKPSDYCII EKHRSSERVL PPLTRILRLN KANGOEQPHM QFVLVK ADAP 120 

LPVPLWRTAB AKLVQMTEKL WELSPANYMK TLPPDKQKRI VRKTPRKIiAK IKQDTVSHDR 180 

DNMBTLVHLI ISQDHTIHQQ VKRMKELDLE lEKCEAKPHL DRVENDGEMY VQDAYLMPSF 240 

SEVBQNLDLQ YEENQTLEDL SESDGIEQLE ERLKYYRILI DKLSAEIEKE VKSVCIDINE 300 

DAEGBAASEL ESSNLBSVKC DLEKSMKAGL KIHSHLSGIQ KBIKYSDSLL QMKAKEYELL 360 

AKEPNSUUS NKDGOQIiXEN RAKESEVPSS MGBIPPPTQR VPSNYTNDTD SDTGISSNB8 420 
QDSETTVGDV VbLST 

Seq ID NO: 322 DNA sequence

Nucleic Acid Accession #: NM_030920.1
Coding sequence: 317-1123

1 11 21 31 41 51 

| | | | | |

AGCATTGAAG GGGAAGGAAC TGCGGGTGTG GTGTOTGTAT GTGTGTGTGT ATGT6TGTGC 60

GGCGCGTGCG TGCGTGTGTG TGCGCGCGCT AGTGTGT6GA CAAGQAGGTG GGGGCAGCTG 120 

AGTTAGAGTC CCAACTCTTG OACTCCATTT GCTATTCTCT TCTTTCTCCC (XACACCTAT 180 

CTGGTGGTG6 TAGT6GGGQT TTATATTTGC GTTCCTTTTC ATTCATTTCT AAATCTCTTA 240 

AAAATTTTGG GTTGGGGGTA TTGOGGAAGQ CAGOAAAGGG AAAAGGAGAG TAGTAGCTGA 300 

AGAGCAAiSAG GAGGACATGG AGATGAAGAA GAAGATTAAC CTGGAGTTAA GGAACAGATC 360 

CC0GGAGGA6 GTGA CAGA GT TAGTCCTTGA TAATTGCCTG TGTGTCAAT6 GGGAAATT6A 420 

AGGCCT6AAT GATACTTTCA AAGAACTAGA ATTTCTGIAGT ATGGCTAATQ T6GAACTAAG 480 

TTCX3CTG0CC 06GCTTCCCA GCTTAAATAA ACTTCGAAAA TTOGAGCTTA 6TGATAATAT 540 

AATTTCTQGA GGCTTGGAAG TCCTGGCAGA GAAATGTCCA AATCTTACCT ACCTCAATCT 600 

QAGTGGAAAC AAAATAAAAG ATCTCAGTAC AGTAQAAGCT CTGCAAAATC TTAAAAATTT 660 

GAAAA GTCTT GACCTGTTTA ACTGTGAGAT CACAAACCTG QAAGATTATA GAGAAAGTAT 720 

TTTTGAACTA CTGCAGCAAA TCACATACTT AGATGOATTT GATCAGGAGG ATAATQAA6C 780 

6CC6GACTCT GAAGAGGAGG ATGATGAGaA TGOAGATGAA 6ATGATGAA6 A66AAQAGGA 840 

AAATGAAGCT (^STCCACCGG AAGGATATGA GGAAGAGGAG GAGGAAGAGG AAGAGGAGGA 900 

TGAGGATGAG GATQAAGATG AAGATGAAGC AGGTTCAGAG TTGGGAGAGG GAGAAGAGQA 960 

AGTGG6CCTC TCATACTTAA T6AAAQAAGA AATTCAGGAT GAAGAAGATG ATGATGACTA 1020

TGTTGAAGAA GGGQAAGAAG AGGAAGAAGA GGAAGAAGQA GGTCTTCiGAG GGGAGAAGAG 1080 

GAAACGAGAT GCTGAAGACG ATGQAGAGGA A0AAGAT6AC TAGATCATTC TAAGACCAQA 1140 

TTCTCTAATG TTTCTGGGTG T6CAATAGAG TGATCACATC TTTOTTTCTT CATGTACGAT 1200 

AGCTATCCCT ACAGAAGATA ATGTGTAACT TTTTATAGGA AAAGTGTGGT TTTACTATTT 1260 

TTGCCTTATC ATTCCAAATA AGAACTA6TC TGTTAATGAT CATATTGTAT GTAGAGAAAA 1320 

ATTTTCATTG ACTCCCATTG TGGAATTCX:C TAGCAATTTA TTTAGACTTA ATTTTTTAAA 1380

TTCAAGCTTA CTGTATTAGT CATTTTTAGC CCATAATTAA AACATGATCA CTTTTAAACA 1440 

GGTGTAGTAT GGT6CATTTC ATTCCTTATT TATAGATTAA CTGAAATTAC AGTTTGCTAT 1500 

AATATAAAAT QACAATAGTC TCTTGAGTGG TAAGTTGGTT ATTTTTTTAG AGGTGATCCA 1560 

GGA ATCTT TA GTTT6AAGGC AGTTACCTTT TTTTTTTTTT TTTTTTTTTG ACTAAGAGTG 1620 

TTTGGTTGCT TTTTTGTCAC AAGTAACTTG GAAAATAGAA GCAGAATAGT AAAGQTTCTA 1680 

TTCAGCAACA TAGTTCATGG ATTTTGTGGA G6TTCTATTC AGTAATATGG TTCATOGATT 1740 

TAGTGGTGAC TGATAAGATT TTATTTTTGA AGGAAAAATT GCTTATACTA AGTCCRGAGA 1800 

C ATGCA OaTG AGCCCTTTTO TCAGGCTGCA AATCATGACA TGCCQATGGT TGTTTATTTT 1860 

OTTTTTAQGT GTGCATTCTT TTTCTTCTTA GCAATTCCTT TATGATCACC TTCCCTTCTT 1920 

GTTTCACTCC CTCCOGCTCT CTCAAAAGGA ACTT6GGAAA CTTQTGAAAC CCAGGAAAAC 1980 

OTTTAQTCTT AT ACCTC AAC TACGTTTCAG TCCTGTCTGG GTTTTAAATA AGTGAAGTAG 2040 

AAGAAATTGA GTATTTTCTG ACATAAGAAT ATATTATCAA TAGAGTTTTA TGCAGTAAGC 2100 

TCTCCTTACC ATAAATQTTT CTTGOTTQAC AACATCTAAQ ACAATATTA6 TGGQAT6AAG 2160 

AAAGAAAAGC AGGGGTGCTT TTG6AAGCAG TQTTAGTGTT CCTCAAAAGT CGGAACAATT 2220 

GCCTGTT6AT ATATTAATAA GACATTAAAG TCAAATTTTA ATGTTGGCCT CTCAAATGAT 2280 

TTGGATACCA CTCTGCAAAG TATTTCTAAC CTTTAATTCC CAGTTTTAAA ACAGATATAA 2340 

TAATA6CAT7 rAATTGQAAT ATACTAOGCA GCTGGAAAAG TATTT6AAAC TAAATTGACA 2400 

TTAAAATTAA GATTTGTTTT CAA6TGGATG TCCATTAAAA GTAQAAAAAT ATTTOGSATA 2460 

AGTGAGTGtG TOTTTCCTTA CATGGCTACT AAATAAAATA TAATGA6TAT ACAAGTATAT 2520 

CTCCT CTTTT GCTATGQAGG CTCCATGTTC AAGGCAATGG CTTTTTAAAT CTTGGCTATC 2580 

TAAAATTTTT TCCCTTTGTT TTGAATATTT GTAAGTTTTT AAGAAGTTAG TGTCAGCAAA 2640 

TTAATTGAAG TTATGCTTCT ATACTGGGAC ATATTTAAAT ACT6AGTATA GTACT6CTGC 2700 

TACTGCTTCT ACAATGTAAA ATGTATGACT TGOTOTTTTA AAGTAAAAAT TATGATQTTA 2760 

CTTGTGGAGA AAGTAAAAAT GTTGTACAAC TGACC6AAAG AAAACCCTT6 GGGATAAGTT 2820 

TAGTGAGGGG ATTGQAATCC CCAAAAA6AT AACATTTTTC TTCTGCTTTT AAAAACTGAA 2880 

ATTCOC TgTT CTAGTTCCTA ACAATTCTCA TTACATACTA TG0CA6ATTA CAAAATACTT 2940 

ATTTTTAAAA TGAAATCTAT ATATTGACTT TCTTATCAAT CATCTTACTG TGCAATCAAA 3000 

ATTAGAGTAC TTTGGTTTGA AAACAACACT TAGAGCCTCC AGATAACTTT TAAOACTTAT 3060 

TTAGCTTTGT GGGTGGTATT TTCATGCAAA TAAGTAAGGG T(X3GTTTTAT ATTTTGTAGA 3120 

AGTTTTCGGT CCTATTTTAA TGCTCTTTGT ATGGCAGTAT GTATATATTG TGTTAAGTTC 3180 

CTCAAGAATC TOCTTAAAAA CTTTGAAGTT AAXACTTTTG TGCAACTOTG TTTT6AATAA 3240 
A6CCATGACA GTGTTAAAAA CAAAC 

Seq ID NO: 323 Protein sequence:
Protein Accession #: NP_112182.1

1 11 21 31 41 51 

| | | | | |

MEMKKKINLE LRNRSPEEVT EIiVLmCliCV NGEIEGLNDT FKELEFLSMA NVELSSLARL 60 

PSLNKLRKLE LSDNIXSGGL EVLAEKCPNL TYXiNLStanU KDLSTVEALQ NLKNLKSLDL 120 

ENCTITHLED YRBSIFEIiLQ QITYLDGFDQ EDNEAPOSEE EDDEDGDEDD EEBEENEA6P 180 

PEGY^^EEE BEBEDEDEDE DEDBA6SBLG EGEBBVGLSY IMKEEXQDEE DDSDYVEEGB 240 
EEEEBEEGGL R6BKRKRDAE DDGEEEDD 

Seq ID NO: 324 DNA sequence
Nucleic Acid Accession #: NM_003812
Coding sequence: 224..2722



1 11 21 31 41 51 

| | | | | |

TCCTCTGCGT CCCGCCCCGG GAGTG6CTGC GAG6CTAQGC GAGCCGGGAA AGGGGGCGCC 60 

GCCCAGCCCC GAGCCCCGCG CCCCGTGCCC CGAGCCCGGA GCCCCCTGCC CG0GG0G6CA 120 

CCATG0G06C CGAGCGGGOG TGACOSGCTC CGCCGGCGGC CGCCCCGCAG CTAGCCCGGC 180 

GCTCTOGCCG GCCACAOGGA GOGGOQCGGG GGAGCTATGA GCCAtGAAGC CGCOCGGCAG 240 

CA6CTCGCGG CAGCCGCCCC TGGCQGGCTQ CAGCCTTGCC GGCGCTTCCT GOGGCCCCCA 300 

AOGCGGCCCC GCCGGCTCGG TGCCTGCCA6 CGCCCOGGCC CGCAOQCCGC CCTGCCGCCT 360 

GCTTCTCGTC CTTCTCCTGC TGCCTCCGCT CGCCGCCTCG TCCCGGCCCC GCGCCTGGGG 420 

GGCTGCTGCG CCCA6CGCTC CGCATTGGAA TGAAACTGCA GAAAAAAATT TGGGAGTCCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCA6TT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAGACTOQGA 600 
AAGCCCTTAT CACGTTCTTO ACACAAAGGC AAOACACCAO CAAAAACATA ATAAOGCTGT 660

CCATCTGGCC C3U3GCAAQCT TCCAGATTGA AGCCTTCGGC TCCAAATTCA TTCTTGACCT 720 

CATACTGAAC AATGGTTTQT TQTCTTCTOA TTATGTGGAG ATTCACTACG AAAATGGGAA 780 

ACCACAOTAC TCTAAGGOTG GAGAGCACTG TTACTACCAT GGAAGCATCA GAGGOOTCAA 840 

AOACrCCAAO aTGGCTCTOT GAACCTQGAA TGGACTTCAT G0GATGTTT6 AAGATGATAC 900 

CTTOSTGTAT ATGATAQAGC CACTASA6CT GGTTCATGAT 6A6AAAAGCA CAGGTa3ACC 960 

ACATATAATC CAGAAAACCT TGGCAGQACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 

GQAAAQAGGT GACCAGTGGC CCTTTCTCTC TGAATTACAG TGGTTGAAAA GAAGGAAGAG 1080 

AGCAOIOAAT CCA.TCAC6TG GTATATTTGA AGAAATGAAA TATTTGGAAC TTATQATTGT 1140 

TAAT6ATCAC AAAA06TATA AGAAOCATGG CTCITCrCAT GCACATACCA ACAACTTTQC 1200 

AAAGTC03T6 GTCAACCTTG T6GATTCTAT TTACAAGORG CAGCTCAACA CCAGOOTTOT 1260 

CCTGGTCGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACCA CCAACCCTGT 1320 

GCAQATGCTC CATGAGTTCT CAAAATACCG GCAGCGCATT AAGCAOCATG CTQATGCTQT 1380 

6CACCTCATC TGGGGGGTOA CATTTCACTA TAAGAGAAGC AGTCTQAGTT ACTTTOGflflQ 1440 

TGTCTGTTCT OQCACAAGAO GA6TTG0TGT GAATQAOTAT GOTCTTCC3UI TG6CAGT6GC 1500 

ACAAGTATTA TQ6CAGAGCC TGGCTCAAAA CCTTG6AATC CAATGGGAAC CTTCTAGCAG 1560

AAAGCCAAAA TGTGACTGCA CAGAATCCTG GGGTQGCTGC ATCATGGAGG AAACAGGGGT 1620 

GTCCCATTCT CGAAAATTTT CAAAGTGCAQ CATTTTGOAG TATAQAGACT TTTTACAGAG 1680 

AGGAGGTGGA GCCTGCCTTT TCAACAGGCC AACAAAGCTA TTTGAGCCCA 0GGAAT6TGG 1740 

AAATGQATAC GTG3AAGCIQ GGGAGOAGTG TGATtOTOST TTTCATGTGG AATGCTATG6 1800 

ATTATOCTGT AAGAAATOTT CCCTCTCCAA COGGGCTCAC TGCAGOGAOG GGCCCTGCTG 1860 

TAACAATACC TCATGTCTTT TTCAGCCACG AGGGTATGAA TGCCX3GGATG CTGTGAACX3A 1920 

GTGTQATATT ACTGAATATT GTACTGGAGA CTCTGGTCAG TGCCCACCAA ATCTTCATAA 1980 

GCAA6A066A TATGCATOCA ATCAAAATCA GGGCCX3CTGC TACAATGGC3G AGTGCAAOAC 2040 

CAOAGACAAC CAGTGTChGT ACATCTGGGG AACAAAG6CT GCA6G6TCTG ACAAGTTCTG 2100 

CTA7GAAAAG CTGAATACAG AAGGCACTGA 6AAGGGAAAC TG06GQAAGG ATGGAGACOG 2160 

GTGGATTCAG TGCAGCAAAC ATQATGTGTT CTOTGGATTC TTACTCTQTA CCAATCTTAC 2220 

TOGAGCTCCA CGTATTGGTC AACTTCAGOG TGAGATCATT CCAACTTCCT TCTACCATCA 2280 

AQGCCGGGIG ATTGACTGCA GTG6T6CCCA TGTAGTTTTA QATGATGATA CGGATQTGGG 2340 

CTATGTASAA GATGGAAOGC CATGTGOCCC GTCTATQATG TGTTTAGATC GQAAGTGCCT 2400 

ACAAATTCAA GCCX:TAAATA TGAGC!A6CTG TCCACTCGAT TCCAAGGGTA AAGTCTGTTC 2460 

GGGCCATGGG GTGTOTAGTA ATOAAGCCAC CTGCATTTGT GATTTCACCT GGGC3«3G6AC 2520

AGATTGCAGT ATCCGGGATC CAGTTAGGAA CCTTCACCCC CXXaAGGATG AAGGACCXiaA 2580 

GGGTCCTAGT GCCACCAATC TCATAATAGG CTCCATCGCT GGTGCCATCC TGGTAGCAGC 2640 

TATTOTCCTT GGQGGCACAG GCTGGGGATT TAAAAATGTC AAGAAGAGAA GOTTCOATCC 2700 

TACTCAGCAA GGCCCCATCT GAATCAGCTG OGCTGGATGG ACACCGCCTT GCACTGTTOO 2760 

ATTCTGGGTA TGACATACTC GCAGCAGTGT TACTGOAACT ATTAAGTTTO TAAACAAAAC 2820 

CTTTGGGTGG TAATGACTAC GGAGCTAAAG TTGGGGTGAC AAfiSATGOGG TAAAAGAAAA 2880 

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTC3VCC AGCTOTCAGT AAAOGGGGGA 2940 

GGGG6CAAAA GACCATGCIA TAAAAAQAAC TGTTCCAGAA TCmTTTTT TCCCTAAIOO 3000 
AOGAAGGAAC AACACACACA CAAAAATTAA ATGCAATAAA Q6AATCATTA AAAA 

Seq ID NO: 325 Protein sequence:
Protein Accession #: NP_003803

1 11 21 31 41 51

I I I I I I

MKFPGSSSRQ PPLAGCSLAG ASOGFQRGFA 6SVPASAFAR TFPOUiIiLVL LLItPPLAASS 60 

RPRAWGAAAP SAPHWNBTAE KKLGVLADED NTU2QNSSSN ISY6HAHQKE ITCiPSRIiXYy 120 

INQDSESPYH VLDTKARHQQ KHHKAVHLAQ ASFQIBAFGS KPILDLILNN QhLSSDTVEl 180 

HYENGKPQYS KGGBHCYYHG SIRGVKDSKV ALSTCNGLHG MFEDDTFVYM IEP LEL\m DE 240 

KSTGRPHIIQ KTLAGQYSKQ MKNIiTMBRGD QWPPLSEWJW LKRRKRAVNP SRGIFEBOCy 300 

IiBLNXVNDHK TYKKKRSSHA HTHNFAKSW NLVDSIYKBQ UITRWLVAV EIHTEKDQXD 360 

ITTNPVQMIiH EFSKYRQRIK QHADAVHIiXS RVTPHYKRSS LSYFOGVCSR TRGVGVNBVG 420 

LPMAVAQVLS QSLAQNLGIQ WEPSSRKPKC DCTSSWGGCI MEBTGVSHSR KFSKCSILEY 480 

RDFLQRGGGA CLFHRPTKLF EPTBOGNGYV EAGEECaXGF HVECYGLCCK KCSLSNGAHC 540 

SDGPCQ9NTS CLFQPRGYEC RDAVNECSIT EYCTGDSGQC PPNLHKQDGY ACNQMp GRCY 600 

NGECKnoaiQ OQYIWGTKAA GSOKFCYBKL HTBGTBKGIIC OKDGDRWIQC SfEHDVFOGFI* 660 

ItCTNLTRAPR ZG0LQ6EIIP TSFYHQGRVI DCSGAHWLD DDTDVGWED GTP06PSMMC 720 

LORKCLQIQA LNMSSCPLDS K6KVCS6BGV CSNBATCIO} FTNAGTDCSI ROPVRNItHPP 780 
KDBQPXGPSA TNX>IZ6SIAG AILVAAZVLG GTGWGFKNVK KRRFDPTQCX3 PI 

Seq ID NO: 326 DNA sequence
Nucleic Acid Accession #: AK074418.1
Coding sequence: 244-1515

1 11 21 31 41 51 

I I I I I I 

CTTTCTCCAA GA06GCGGGC CATGCTCTOC TCCTCTGCCA GTCTCCTCCA CCACTCTCTA 60 

ACCTGA6AGC CTGTGGAACC TGCCCGTCTC CCCTCCTCCA TC^GACACAC CIGCCTAGGA 120 

AACAGATGGA AAAAGTGAGG GACCGGTGA6 TGACTTGCTG CTAAAGTTTA TACCAGATGC 180 

AAATGACAGA GCTGGAGTTC TGCTGT6CCT GGAAAGGACC TOGGAAGTCT TCTAAQGAQA 240 

GTCATGGOGT ATTACCAGOA GCCTTCAaTG GAOACCTCCA TCATCAAGTT CAAAQACCAQ 300 

GACTTTACCA CCTTGCGGGA TCACTGCCTG AOCATOGGCC GGACOTTTAA GGATQAGACA 360 

TTCCC06CAG CAGATTCTTC CATAGGCCAG AAGCTGCTCC AGGAAAAACG CCTCTCCAAT 420 

G1X3ATAIGGA AGCGGCCACA GGATCTACCA GGQGGTCCTC CTCACTTCAT CCTGGATGAT 480 

ATAAGCAGAT TTGACATCCA ACAAGGAQGC GCAGCTGACT GCTGGTTCCT G6CAGCACTG 540 

GGATCCTTQA CTCAGAACCC ACAGTACAOG CAGAAGATCC TGATGGTCCA AAGCTTTTCA 600 

CAGCAOTATO CTGOCATTTT CCGTTTCCQG TTCTGGCAAT GTGGCCAGTG GGTGGAAGTG 660 

GTGATTGATG ACCGCCTACC TGTCCAGGGA GATAAATGCC TCTTTGTGCG TCCTCGCXAC 720 

CAAAACCAAG AGTTCTOGCC CTGCCTGCTG GAGAAGGCCT ATGCCAAGCT GCTCXSGATCC 780 

TATTCOQATC TGCACTATGG CTTCCTCGAG GATGCCCTGG TGGACCTCAC AGGAGGCGTG 840 

ATCACCAACA TCCATCTGCA CTCTTCCCCT GTGGACCTGG TGAAGGCAGT GAAGACAGCX3 900 

ACCAAGGCAG GCTCCCTGAT AACCTGTGCC ACTCCAAGTQ GGCCAACAGA TACAGCACAQ 960 

6CGATGGAGA AtGGGCTGGT GAGTCTCCAT GCCTACACTG T6ACT6G6GC TGAGCAGATT 1020 

CAATACCGAA GGGGCTGGGA AGAAATTATC TCCCrOTGOA AC00CTG6Q6 CIGG66GGA8 1080 

ACGGAATGGA QAGGGC6CTG GAGTGATGGO TCTCAOGAGT GGGAQGAAAC CTGT8ATC0Q 1140 
CGGAAAAGCC AGCTACATAA GAAACGQGAA 
TTCCAACAGA AATTCATCGC CATOTTTATA 
GGAAACACAC TCCACGAAGG ATGGTCCCAA 
AACACTGCAG C3AGGACCTCQ GAATQATGCT 
GAAGGCACCft ATCTTGTC6T QTGCGTCACA 
GAAGATGCAA AATTTCCACT CGATTTCCAA 
CCAAAGCTCA AATAATAAAT TCCX3CCGCAA 
6AACTATGTT GTGGTTGCAC AGACACGGAG 
CXn'GAAAATO CCAGACASTG ACAGGCACCT 
AAGCCCTTCA G^ACATGQCT CCCAACAAAG 
GTACCTAGCA CXCAGGGGCC TTAOSTGGGA 
CCCTCACAGG CCCTTACTGG GAT6CAGAGA 
GCCTCTCTTC CTGGATCGTC TCCAGAACTG 
C6CCCCACCC A6TCTCATCC GGGGGACTTC 
GGAZAATTAT G0GGTOT6A6 GTQCATTGCC 
ACCCCGTGAA ACCTrTCCTT CTCCTACTOG. 
CCXX3GGAGCT AGCCAGCTTC AGAAAGCACA 
TGCACACAGG ATITCCTTAA TGGCTTAATA 
GAATAAAATA GCTGCCAGGG 6CTCTGCACA 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 327 Protein sequence
Protein Accession #: BABe5075.1



GATGGC6AGT 
TGTAQCGAAA 
ATAAT6TTTA 
CAATTCAACT 
GTTGCTGTCA 
6TGATTCTGG 
CTTCACCAT6 
AAAATCAG06 
GAGCA6CCAT 
CATTTTCAAC 
TTGGAGAAA6 
GOAGAAGTGA 
CTQTGGCTGC 
AAGCTGGAAT 
CTCTAAATCT 
GCCACCTCOC 
TACAGCATCC 
AACTGTTATA 
ATGAGCCTCT 



TTTG6ATGTC 
TTCCAATTAC 
GGAAGCAAOT 
TCTCTGTGCA 
CACCATCAAA 
CTGGCTCACA 
ACTTACCATC 
GAGTTCTTGC 
TTCAACCTCA 
AGATATGCTC 
GGGACCTGAG 
CTTGATGGAC 
CAAGCTCGGT 
GCAGAGCTTA 
TTAAACAAGC 
ACCAACCTGG 
TTGCTGCCAA 
AAGAACTCCT 
TACCGTTAAA 



6TGTCAAGAT 
CCTGGACQ^T 
GATTCTAGGA 
AGAGCCAATG 
TTT6AAAGCA 
GAAACACTGT 
TGA6CCCTGG 
TCCGAATCTT 
GAATGAAGGG 
AGCAGGTATG 
GGAGGGACAG 
TATTTTACCT 
AGAGACGTGG 
GAAAGGGAGG 
AATTGGCAGT 
CATCGTTCCT 
ACCACCTATG 
TGACTTGTCA 
AAAAAAAAAA 



1 

I 

MAYYQEPSVE 
IWKRPQDLPG 
QYAGIPRPRF 
SDIjHYGFIiED 
MENGLVSIiHA 
KSQLHKKRED 

KLK 



XI 

I 

TSIIKPKDQD 
GPPHFILDDI 
WQCGQWVEW 
ALVDI.TGGVI 
YTVTGAEQIQ 
GEPHMSCQDF 
PNPSVQSPNE 



21 
I 

FTTLRDHCLS 
SRFDIQQGGA 
IPDRLPVQGD 
THIHI£SSPV 
YRRGMEBIIS 
QQKPIAKFIC 
OTNVWCVTV 



31 



41 



M6RTFKDETF 
ADCWFliAALG 



DLVKAVKTAT 
LWNPW3N0ET 
SEIPXTLDBG 
AVTPSNUCAE 



I 

PAADSSIGQK 
SLTQNPQYRQ 
NQEFHPCLLE 
KAGSItlTCAT 



51 



NTLBESHSQI 
J3AKFPLDFQV 



i 

LLQEKRXiSNV 
KILMVQSPSH 
KAYAKLLGSY 
P6GPTDTAQA 
QENESTCDPR 
MFRKQVXLGH 
ILAGSQKHCP 



Seq ID NO: 328 DNA sequence

Nucleic Acid Accession #: BC017490.1

Coding sequence: 74-2788



1 11 21 31 41 51

I I I I I I

6T6G6TCACG TGAACCACTT TTGGOGGGAA ACCTX3GTTGT TGCTGTAGTG GCGGAGAGGA 60

TOGTGGTACT 6CTATG6CG6 AATCATCG6A ATCCTTCACC AT6GCATCCA GCCCGGCCCA 120

GOGTCGGCQA GGCAATGATC CTCTCACCTC CAGCCCTGGC CGAAGCTCCC GG06TACTQA 180

TGCCCTCACC TCCAGCCCTG GCCX3TGACCT TCCACCATTT GAGGATGAGT CCX3AGGG6CT 240

CXHAGGCACA GAGGGGCCCC TGGAGGAAGA AGAG6ATGGA GAGGAGCTCA TTGGAGATGG 300

CATGGAARGG GftCTACCGGG CCATCGCAGA GCTGGAOGCC TATQAGGCG6 AGGGACTGGC 360

TCTGGATOAT GAOGAGGTAG AQ6AGCTGAC GGOCAOTCAG AGQGAGGCAG CAGAGOGGGC 420

CATGGGGCAG CGTGACOGGG AGGCTGGCCG GGGCCTGGGC CQCATGCGCC GTGGGCTCCT 480

GTATGACA6C GATGAGGAGG ACGAGGAGCQ CCCTGCCCGC AAGOGCGGCC AGGTGGAGC6 540

GGCCACG6AG GACGGCGAGG AGGACXSAGGA 6AT6ATC6AG A6CATC6AGA ACCTGGAGGA 600

TCTCAAAGGC CACTCTGTGC GGQAGTGGGT OAGCATGGCG G6C0C0G6GC T6GAGATCCA 660

CCACCGCTTC AAGAACTTCC TQOGCACrCA OSTGSACAaC CA0S6CCACA ACGTCTTCAA 720

GGAGCGCATC AG06ACATGT GCAAA6AGAA COSTGAGAGC CTGGTGGTGA ACTATGAGGA 780

CTTGGCAGCC AGGQAGCACXS TGCTQGCCTA CTTCCTGCCT GAGGCACCGG OGGAGCTGCT 840

GCAGATCTTT GATGAG6CTG CCCTGGAGGT g6tactggcc ATGTACCCXA AGTAC6ACCG 900

CATCACCAAC CACATCCATG TCCGCATCTC ccacctgcct CTG6T6GAG6 A6CTGC6CTC 960

GCTGAGGCAG CTGCATCTGA ACCAGCTGAT ccgcaccagt GGGGIGGTGA GCAGCTQCAC 1020

TGG0GTCCT6 CCCCAGCrCA GCATG6TCAA gtacaactgc AACAAGTGCA ATTTOOTOCT 1080

aOGTCCTTTC TGCCAGTCXTC AGAACCAGGA ggtgaaacx:a GGCTCCTGTC CTGAGTGCCA 1140

GTCGGCGQGC CCCTTTGAGG TCAACATGGA ggagaocatc TATCAGAACT ACCAGGQTAT 1200

COGAATCC3W3 GAGAGTCCAG GCAAAGTGGC ggctggcosg COXSCCCCGCT CCAAGGACGC 1260

CATTCTCCTC GCAGATCTGG TGGACAGCTG caagccagga GACQA6ATA6 A6CTGACTGG 1320

CATCTATCAC AACAACTATG ATGGCTCCCT caacactgcc AATGGCTTCC CTGTCTTTGC 1380

CACTGTCATC CTAGCXIAACC AOGTGGCCAA gaagqacaac AAGGTTGCTG TAGGGGAACT 1440

6AC0GA7GAA GAUGTGAAGA TGATCACTAG CCTCTCCAAG QATCAGCAGA TCGGAGAGAA 1500

GATCTTTGCC AGCATTGCTC CTTCCATCTA TGGTCATGAA GACATCAAGA GAGGCCTGGC 1560

TCTGGCCCTG TTCGGAGGGG agcccaaaaa CCCAGGTGGC AAGCACAAGG TACGTGGTGA 1620

TATCAACGTG CTCTT0TGC3G ga6accx:tgg CACAGOSAAG TCGCAGTTTC TCAAGTATAT 1680

TGA6AAAGTG TCCAGOGGAG CCATCTTCAC CACTGGCCAG GGG6C6T0GG CTGT6GGCCT 1740

CSICQGCQTAT OTCCAGOGGC accctgtca6 GAGGGAGTGG ACCTTGGAGG CTQGGGCCCT 1800

GGTTCTGGCT 6ACCGAGGAG TGTGTCTCAT TGATGAATTT GACAAGATGA ATGACCAGGA 1860

CAQAACCAGC ATCCATGAGG CCATGGAGCA ACAGAGCATC TCCATCTCGA AGGCTGGCAT 1920

CX3TCACCTCC CTGCAGGCTC GCTGCACX3GT CATTCCTGCC 6CCAACCCCA TAGGAGGGCG 1980

CTACX3ACCCC TC3GCT6ACTT TCTCTGAGAA GGTGGACCTC ACAGAOCCCA TCATCTCAOG 2040

CTTTGACATC CTGTGTGTGG TGAGGGACAC COTGGACCCA GTGCAGGAOG AGATGCTGGC 2100

COGCTTOGTG GTGGGCA6CC ACX3TCAGACA CCACCCCAGC AACAAGGAGG AGGAGGGGCT 2160

G6CCAATGOC A606CTGCTG AGCCCGCCAT GCCCAACACG TATGGCGTGG AGCCCCTGCX: 2220

CXAGGAGGTC CTQAAGAAQT ACATCATCTA CGCCAAGGAG AGGGTCCACC CX3AAGCTCAA 2280

CCAGATGGAC CAG6ACAAG6 TGGCCAAGAT GTACAGTGAC CTGAGGAAAG AATCTATGGC 2340

GACAGGCAGC ATCCCCATTA CGGTGCX>GCA CATCX5AGTCC ATX5ATCCGCA TGGCGGAGGC 2400

CCACGC6C6C ATCCATCTGC GGGACTATGT GATC6AAGAC GAC6TCAACA TGGCCATCC3G 2460

OGTQATGCTG GAGAOCTTCA TAGACACACA GAAOrrCAGC GTCATGCGCA GCATGCGCAA 2520

GACTTTT6CC CGCTACCTTT CATTC06GCX3 TGACAACAAT GAGCTGTTGC TCTTCATACT 2580

6AAGCA6TTA GTGGCAGAjQC AGOTGACATA TCAOGQCAAC CGCTTTGGGQ CCC3V6CA06A 2640

CACTATTQAG GTCCCTGAGA AGGACTTGGT GQAZAAGGCT GGTCAfiATCA ACATCCACAA 2700



1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
60
120
180
240
300
360
420




CCTCTCTGCA TTTTATQACA QTQAGCTCTT 
AAGGAAAAT6 ATCCTGCAGC AGTTCTGA60 
TTCTGGTTTG GGGTOOTCAG TOCCCTCTGT 
T6AACT06G0 GTACTAGGGT CAGGGCTTAT 
TGTTTQTTTC TCCAAGCCTG CTTTGTGCTT 
TOTCTTACTT GGTTGCTGAA CATCTTGCCA 
TTQGATCAGA GCTGCTGAGT TCAGGAT60C 
ATGGATQTCA GQAQAOCTGC TGCCCTCTTG 
TGCCTTTGGC CA6AGAGCTG GTTGAAGATG 
TCTGT6CCCC TGTGGTG6AA 6A06GCAG8A 
GTCGCAGGGG TGGGATGTGA GTCATQOGGA 
TTGCTCCCTG ItriXiWCCC CACTCTCTTA 
TAATTTTTAA TAAAOTTGAA TAAAATATAA 

Seq ID NO: 329 Protein sequence
Protein Accession #: AAK17490.1



CAOGATGAAC 
CCCTATGCCA 
GCTTTATGGA 
AQCAGGATGT 
CTCACCTTTG 
CCTCOGAGTG 
TGCQTGTGGT 
GCGTGAGTTG 
TTTGTAATCG 
CAGTGOCAGC 
TTATCCACTC 
TTTGT6CATT 
AAAAAAAAAA 



1 
I 

MAB SSBSP TM 
OPIfSBEDGE 
OREAGRGLGR 
SVREWVSMAG 
BHVLAYFLPE 
HLNQLIRTSG 
PEVNMEBTIY 
HYDGSLKTAN 
lAPSIYGHED 
SRAIPTTGQ6 
HEAMEQQSXS 
CWRDTVDPV 
KXYIIYAKER 
HLRDYVIEDD 
AEQVTYQRNR 
LQQF 



11 

I 

ASSPAQRRRG 
BLIQDGMERD 
NRRGLIiYDSD 
PRIiBIHHRFK 
APAELLQIFD 
WTSCTGVLP 
QNYQRIRIQE 
GFPVFATVir* 
IKRGLALAI<F 
ASAVGLTAYV 
ISKA6IVTSL 
QDEMLARPW 
VHPKLNQHDQ 
VNMAIRVMIiB 
FGAQQDTIEV 



MDPLTSSPGR 
YRAIPBUIAY 



NFLRTHVDSH 
EAALEWIiAM 
QLSMVKYNCM 
SPGKVAAGRL 
ANHVAKKDNK 
GGBPKNPGGK 
QRHFVSRENT 
QARCrVIAAA 
GSHVRHHPSK 
DKVAKMYSDL 
SFIDTQXPSV 
PBKDIiVDKAR 



31 

I 

SSRRTDALTS 
BAE6LALDDE 
RRQVERATBD 
GHNVFKERIS 
YPKYDRITNH 
KCNFVLGPPC 
PRSKDAILIiA 
VAVGELTDED 
HKVRGDIKVL 
LEAGALVIiAD 
NPIGGRYDPS 
KEEEGLANGS 
RKBSMAT6SI 
HBSMKKTFAR 
QINIBIlIiSAF 



AAGTTCA6CC 


AGGACCVQAA 


2760 


tcx:ataag6a 


TTCCTTGGGA 


2820 


CACAAAACCA 


GAGCACTTGA 


2880 


CTGGCTGCAC 


CTGGCA7XSAC 


2940 


GQTGGGAIGC 


CXT6CCAGT6 


3000 




ACTCAGTACC 


3060 


TTAGGTGTTA 


GCCTTCTTAC 


3120 


CGTATTCAGG 


CTGCTTTTGC 


3180 


TTTTCAGTCT 


CCTGCAGGTT 


3240 






3300 


OCCACAGTTA 


TCAGCTGCCA 


3360 


CX36TTTOGTT 
AAAAAA 


TCTGTAGTTT 


3420 


41 


51 




1 

8PGR0LPPFB 


1 

DSSB6LL0TE 


60 


DVEELTAfiQR 


BAAERAMRQR 


120 


GEEDEEMIGS 


lEHLEDLKGR 


180 


DNCKEHRESIi 


WNYEDLAAR 


240 


IHVRISHLPL 


VEELRSLRQL 


300 


QSQNQEVKPG 


SCPECQSAGP 


360 


DXjVDSCKPGD 


EIELTGIYHN 


420 


VKMITSLSKD 


QQIGEKZFAS 


480 


X.GGDPGTAXS 


QPLKYIBKVS 


540 


R6VCLZDBFD 


KMNDQDRTSZ 


600 


LTPSENVDIiT 


EPIISRPDIL 


660 


AAHPAMPNTY 


GVEPLPQEVIj 


720 


PITVRHIBSH 


ZRMAEAHARI 


780 


yiiSFRRDHNB 


ZiIiLFIXiXQLV 


840 


yDSBLFRMNK 


FSBDLKRKMI 


900 






Seq ID NO: 330 DNA sequence
Nucleic Acid Accession #: M17254
Coding sequence: 257-1645 



GTCC3GCGC0T 
CGGGTOGCAC 
CTTTGGAQAC 
AAAGATGGCA 
TGGCTTACTG 
CTTATCAOR 
6GCTAAGACA 
CCCAOGCGTC 
GGAATGTAAC 
CAAAGGCGGG 
06AGGAGAAG 
AGCAGATCCT 
AGAATATGGC 
GTGCAAGATG 
TCTCTCACAT 
TGATAAAGCC 
TGA6CCCCCC 
TGCTCAACCA 
TTATCAGATT 
GCTTTGGCAG 
GGAAGGCACX: 
AGAGGGGAAG 
CTATGACAA6 
CCACGGGATC 
CTCAGACCTC 
GCCCCACCCT 
CTGGAATTGA 
TTCTCATCTG 
CAOCAOCCCA 
TGAAAAAAGC 
GAGGQAGTTA 
GGACATATCA 
AAGQACAAAG 
AATCCCACTA 
AACATACCGT 
TCAAAAACAA 
ACTGCATGGC 
CAGCTTTCTC 
ACTATGAACT 
AG6GGTGAA0 
TCTCAAGCAA 
ACTCGAGGGT 
GTAATaOAOA 
TGTCAAATGA 
TCATTATGTG 



11 
I 

QTCGGCXSCGC 
TAACTCCCTC 
CCGAGGAAAG 
GAACCAAGGG 
AAGQACATGA 
6TGAGTGA6G 
GAGATGACC6 
CCTCAGCAGG 
CCTAGCTAGG 
AAGATGGT6G 
CACAT60CAC 
ACGCTATGGA 
CTTCCAGACG 
ACCAAGGACG 
CTCCACTACC 
TTACAAAACT 
AGGAGATCAG 
TCTCCTTCCA 
CTTGGACCAA 
TTCCTCCTQG 
AAOSGGGAGT 
AGCAAACCCA 
AACATCATGA 
GCCCAGGCCC 
COGTACATGQ 
CCAGCCCTCC 
CCAACTGOGO 
GGCACTTACT 
TCGCCAGAAA 
TTTACTGOGG 
CTGAAGTCTT 
TCTGTGGACT 
TGCCAAAGAA 
ATGCAAACTG 
TTATAATGCC 
6AGAAAAGAC 
ATQTGCTGTT 
AAACTGTGAA 
AAAAGGTG6G 
AAGGAGGAGG 
TGAAQACT6G 
TCATGCAGTC 
AAGGOAAGTA 
AAATTTTAAC 
GGGGCTTTOT 



21 

I 

GOSTOTGCCA 
GGCGCCXjAOG 
CCGTGTTGAC 
CAACTAAAGC 
TTCAGACTGT 
ACCAGTOGTT 
CGTCCTCCTC 
ATTGGCTGTC 
TGAATGGCTC 
GCAGCCCAGA 
GCCCAAACAT 
GTACAGACCA 
TCAACATCTT 
ACTTCCAGAG 
TCAGAGAGAC 
CTCCA08GTT 
CCTG6AC0G6 
CAGTGCCCAA 
CAAGTA6CCG 
AGCTCCTGTC 
TCAAGATQAC 
ftCATQAACTA 
CCAAGGTCX3V 
TCCA6CCCCA 
6CTGCTATCA 
OCQTGACATC 
GTATATACCC 
ACTAAAGACC 
CTCTATCGGA 
CTGGGGAAGG 
ACTACAGAAA 
GACCTTGTAA 
AGTGGTCTTA 
GGATGAAACT 
ATTTTAAGGA 
ACGAGAQAGA 
TTGQTTQAAA 
OATQACCCAA 
ACTQA6GATG 
AAGAGGCAGA 
ACTCAGGACA 
AGTGTTATAC 
GTAGAATTCA 
TGGAATTGTC 
TCTCCACAOO 



31 
I 

GOGCXSGSTGC 
GOGGCGCTAA 
CAAAAGCAAG 
CGTCAGGTTC 
CCCGGACCCA 
OTTTGAOTGT 
CAGCGACTAT 
TCAACCCCCA 
AAGGAACTCT 
CACCGTT66G 
GACCACGAAC 
TGT6066CAG 
GTTATTCCAG 
GCTCACCCCC 
TCCTCTTCCA 
AATGCATGCT 
TCAOSGCXAC 
AACTGAAGAC 
CCTTGCAAAT 
GQACAGCTCC 
6GATCC0QAC 
0GATAA6CTC 
TGGGAAGOSC 
CCCCCCQGAG 
CGCCCACCXy^ 
TTCCAGTTTT 
CAACACTAGG 
TGGCGGAGGC 
GAACATGAAT 
AAGCC!GGGGA 
TGAGGAGGAT 
AAGACAQTGT 
AGAAATGTAT 
AAAGCAATA6 
AAACTACXTG 
CT0T6GCCCA 
TCAAATACAT 
AOTTTCCAAC 
TGIAT2U3AGT 
GAAG GAGGAG 
TTCGGGGACT 
CAAACCCAGT 
GAAACAAAAA 
TGATATTTAA 
GTCAflGTAAG 



41 
I 

CTTGOCOGIG 
CCTCTCGGTT 
ACAAATGACT 
TGAACAGCTG 
GCAGCICATA 
GCCTACQGAA 
GQACAGACTT 
GCCAGGQTCA 
CCTGATGAAT 
ATGAACTAOG 
QA60GCAGAG 
TGGCTGGAGT 
AACATCGATG 
AGCTACAACG 
CATTTGACTT 
AGAAACACAG 
CCCAG6CCCC 
CAGCGTCCTC 
CCAGGCAGTG 
AACTCCAGCT 
QA6GTGGCCC 
AGCOSGGCCC 
TACGCCTACA 
TCATCrCTGT 
CA6AAGATGA 
TTTGCTGCCC 

ctccccacca 
ttttcccatc 
caaaagtgcc 
agagatccaa 
gctaaaaatg 
atgtagaagc 
aaactttaqa 
aaacaacaca 

TATTTAAAAA 
TC AACAG ACG 
TCOGTTTGAT 
TCCTTTACAG 
GAGOQTGTGA 
ACCAGGCT66 
GTCTACAATG 
GT7AG6AGAA 
TQOGCATCrC 
GAOAAACATT 
AOATGGCCrT 



51 
I 

OGGQ008AGC 
ATTCCA6GAT 
CACAGAGAAA 
GTAGATGGGC 
TCAAGGAAGC 
C8CCACACCT 
CCAAQATQAa 
CXaTCAAAAT 
OCAGTGTGGC 
GCAGCTACAT 
TTATOGTGCC 
GG6G6GTGAA 
GGAAGGAACT 
CCGACATCCT 
CAGATGATGT 
ATTTACCATA 
AGTCGAAAaC 
AGTTAGATCC 
GCXAGATCXIA 
GCATCACCTG 
QGCGCTGGGG 
TCOGTTACTA 
AGTTCXyVCTT 
ACAAGTACCC 
ACTTTGTGGC 
CAAACCCATA 
GCCATATGCC 
AGCGTGCATT 
TCAAGA6GAA 
AGACTCTTGG 
TCACGAATAT 
ATGAAGTCTT 
GTAGA6TTTQ 
GTTTIGACCT 
TAGTTTCATA 
TTGATATGCA 
GGACAGCTGT 
TATTACCGGG 
TTGTA6ACAG 
GAAAOAAACT 
AGTTATGGA6 
AGGACACAGC 

CAGGAGCTCA 
CTTGGCIGGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 




ACAATCAGAA ATCA06CAGQ CATTTTGGGT AGQCGGCCTC CAGTTTTCCT TTQAGTCGCG 2760 

AACGCTGT6C GTTTGTCAGA ATGAAGTATA CAAGTCAATG TTTTTCCCCC TTTTTATATA 2820 

ATAATTATAT AACTTATSCA TTTATACACT AG6AGTTGAT CTCGGCCA6C CAAAGACACA 2880 

CGACAAAAGA QACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940 

TACWVTATGA AQTTATTAGT TCTTAGAATG CA6AATGTAT GTAATAAAAT AAGCTTGGCC 3000 

TAGCAT6GCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA 6TGACTAAAG 3060 

TTGCTTAATG AAAACATGTQ CT6AAT6TTG TGGATTTTQT GTTATAATTT ACTTI6TCCA 3120 
GGAACTTGTG CAAGGGA6AG CCAAGGAAAT AGGATGTTTG 6CAGCC 

Seq ID NO: 331 Protein sequence 
Protein Accession #: AAA52398

1 11 21 31 41 51

I I I I I I

MIQTVPDPAA HIKBALSWS EDQSLFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ 60 

QDWLSQPPAR VTIKMBCNPS QVNGSRNSFD ECSVAK6GKM VGSPDTVGMN YGSYMEBKHM 120 

PPFNT4TTXIBR RVIVPADPTL WSTDHVRQWL EKAVXEYGLP DVNILLFQm DQKELCRHTK 180 

DDFQRLTPSY NADILLSHia YLRETPLPHL TSDDVDKALQ NSPRLNHARN TDIiPYBPPRR 240 

SAHT6H6»PT PQSKAAQPSP STVPKTEDQR PQtiDPYQILG PTSSRIiAKPG SGQIQLUQFL 300 

LEILSDSSNS SCITWBGTNG EFKMTDPDEV ARRWGERKSK PNMNVDKLSR ALRYYYDKNI 360 

MTKVHGKRYA YKFDFHGXAQ ALQPHPPESS LYKYPSDLPY MGSYHAHPQK MNFVAPHFPA 420 

LPVTSSSFFA APNPYWNSPT GGIYPNTRIiP T8HNPSBL0T YY 462 

Seq ID NO: 332 DNA sequence
Nucleic Acid Accession #: NM_000020
Coding sequence: 283-1794

1 11 21 31 41 51

I I I I I I

AGGAAACGOT TTATTAGGAG GGAGTG6TGG AGCTGG6CCA GCSCAGQAAGA OSCTGGAATA 60 

AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGQAG GCTGCC3GCGC CAGCTGCGCC 120 

GAGCGAGCCC CTCCCCX3GCT CCAGCCCX3GT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT 180 

CXrAGCGCTGG CGGTGCAACT GOGGCGGCGC GGTGGAGGGG AGGTGGCCCC GGTCCGCCGA 240 

AGGCTAGCGC CCCGCCACCC GCAGAGOGGG CCCA6AGGGA CCAT6ACCTT GG6CTCCCCC 300 

AGGAAAGGGC TTC1GATGCT GCTGATGGCC TTGGTGACCC AGGGAGACCC TGTGAAGCOO 360 

TCTOGGGGCC CGCTGOTGAC CT6CA00TGT GAGAGCCCAC ATT6CAAGGG 6CCTACCTGC 420 

CGGGGGOCCT GGTGCACAGT AGTGCTGGTG CGGGAGGAGG GGAGGCACXX: CC31GGAAC3VT 480 

06G0GCTGCG GGAACTTGCA CAGG6AGCTC TGCAGGGGGC GCCCCACCGA GTTC36TCAAC 540 

CACTACTGCT GG6ACAGCCA CCTCTGCAAC CACAACGTGT COCTGGTGCT GGAG6CCACC 600 

CAACGTCCTT 0G6AGCAGCC GGQAACAGAT GGCCA6CTG0 CCCTGATCCT GGGGCCOGTG 660 

CTGGCCTTGC TGGCOCTGGT GGCCCTGG6T GTCCTGGGCC TGTGGCAT6T C0GA06GAGG 720 

CAGGAGAAGC AGC6TG6CCT GCACAGOGAO CTGG6AGAGT CCAGTCTCAT CCTGAAAGCA 780 

TCT6AGCAGG GOSACACGAT GTTGGGGGAC CTCCTGGACA GT6ACTGCAC CACAGGGAGT 840 

GGCTCAGGGC TCCCCTTCCT G6T6CAGAGG ACAGTGGCAC GGCAGGTTGC CTTGGTGGAG 900 

TGTGTGG6AA AAGGGC6CTA TG6G6AA0TG TGG0GGG6CT TGTG6CACGG TGA6A6T6T6 960 

GCOQTCAAGA TCTTCTCCTC GAG6GATGAA CAGTCCTGGT TCCGGGA6AC TGAGATCTAT 1020 

AACACAGTAT TGCTCAOACA CGACAACATC CTAGGCTTCA TOGCCTCAGA CATGACCTCC 1080 

CX5CAACTC!GA GCACGCAGCT GTGGCTCATC ACGCACTACC ACGAGCACGG CTCCCTCTAC 1140 

GACTTTCTGC AGA6ACAGAC GCTGGAGCCC CATCTGGCTC TQAGGCTAGC TGTGTCCGGG 1200 

GCATGOQQOC TGGGGCAOCT GCAG6TGGA6 ATCTTOGGTA CACAGGGCAA AOCAGCCATT 1260 

6GCCAC0G0S ACTTCAAGAG OCQCAATGTG CTOGTCAAQA GCAACCT6CA GTGTTGCATC 1320 

GCCGACCTGG GCCTGGCTGT GATGGACTCA CAGGGCAG06 ATTACCTGGA CATCGGCAAC 1380 

AACCCGAGAG TGGGCACCAA GCGGTACATG GCACCC6AG6 TGCTGGACGA GCAGATCCGC 1440 

ACGGACTGCT TTGAGTCCTA CAAGTGGACT GACATCTGGG CCTTTGGCCT GGTGCTGTGG 1500 

GAGATTGCCC GCCG6ACCAT 06TGAATGGC ATCGTG6AGG ACTATAGACC ACCCTTCTAT 1560 

GATGTGSTGC CCAATGACCC CAGCTTTGAG GACATGAAGA AGGTGGTGTG TGT6GATCAQ 1620 

CAGACCCCCA CCATOCCTAA CCGGCTGGCT GCAGACCCGG TCCTCTCAGG CCTAGCTCAG 1680 

ATGATGCGQG AGTQCTGGTA CCCAAACCCC TCTGCCCX5AC TCACCGOSCT QCQGATCAAG 1740 

AAGACACTAC AAAAAATTAG CAACAGTCXIA GAGAAGCCTA AA6TGATTCA ATAGCCCAGG 1800 

AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGGGQGGGT GGGGGGCAGT GGATGGTGCC 1860 

CTATCTQGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAGC TGCGCCTGCC 1920 

TGGTCGGCGC CCAGCXCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG ATCGCCTGCT 1980 

GTCTGGCCIG CTCAAAGOGG CA06CTCOCT GACSQCXTTOGC TCTCTCCCCA CCCCTAIGGC 2040 

CA6CATGGTG CACCCCCTAC CACTCCCGGG ACAGGATGCA AAAGAGGCTC CAGAGTCAGA 2100 

GTGCCAAGCC AGGGAATCCC AGTCXTCAGAC TCAGAGCCCG GGCCTGCACT TTGCCCCCTG 2160 

CCCTTGATCA ACCCCACTGC CCCACCAGAG CTGCCAGGGT GGCACAGGGC CCTGTCCAGC 2220 

CCCTGGCACA CACTTCCCTG C CAGGC CTCA GCCTCTAGCA TAA6CTCCAG AGAGCCAGGG 2280 

CCCATC318TT TCTCTCTGTG GATTTGTATC TCAGCTOCAT GATGCCTTGG GCTTTCTSTC 2340 

TGCTCAACAA GA6TGCAGCT TGCTGAATGT CA6CTGCCTG AGAGAGGTGG GGCCTGACTT 2400 

ACTAGGGCAT TAAATCCTAA GAGGTCCTAC TGAGGTOTGG CAGGATCACA GGCCASTGGA 2460 

AAAAGGGCAG GTCAGATGGG CAAGGCCCAG 6ACTTTCAGA TTAACTGAGA G6ATATCGAG 2520 

GCCAAGCATG 6CAG6GGGAA 6GTCAGTGGG TGTCAAGAQA CCCAG6TCTG ACCCOGGATG 2580 

TTTGCTCGAT GTGACAAAAO CAGOCCTGTC TCAGGACCTT TTCTTTTCTT TTTTCCTTCT 2640 

TTTTTTTTTT GACAOGGAGT TTCGCTCTTG TTGTCCRGGC TAGA6TGCAA TGGCATGATC 2700 

CCAGCTCACC GC3WVCGTCTA CCTCCX:AGGT TCAAATCATT CTCTTGCCTC AGACTCCCGA 2760 

GTAGCTGGGA TTACAGGCAC ATGCCACCAT GCCTGGCTAA TTTTGTATAT TTAGTAGAAA 2820 

CAGGGTTTCA CCATGCTGGC CATGCTGGTT CTCX»ACTCC TGACCTCAGG TGTTCOVCCT 2880 

ACCTCAGCCT CCCAAAGTGC T6GG6TTACA GGrGTOASCC AT0QCX3CCTQ OOCAGGACCT 2940 

TTGTTTCTTA TCTACATATT GGAAGATTTO GTCCTGATGT OCTTTGAGGC TTCTTTAGCT 3000 

CTAGTTCTCT GACACTTCAG CCTATATCAC AGCTAACTTC YTCAGTCTCA TCTATTCCTT 3060 

ATGCTCCAGC CCCTGGCAAT TTGCCTCAAG ATGGGGGTTT GAAAATAACT TTACCTGACT 3120 

CAAGGAGTGT CTGGA6CACC TCCTAGTCTA AGTCTQCAAG CTCCAGTTCT TGCCTAAAAC 3180 

CATGCCAGTG GCCACCCTTG GGCTCAGACA GCTCTGGGCC TTTTGACCAC AAGCCAGCCC 3240 

CTC6CCCTCT CTGTG6CATA QTCTTCTCTG CCCCAGGACT GCAGGGCGGC TTCCTCCAAG 3300 

GCTTCCAAGG CTCAAAAGAA ATTTGGCTOC ATCXAAGAAG GCTCCAGCTC CCCTACTGGC 3360 

CCCTGGCTTC AGGCCCACAC CCCTGGGCCA GGSCCAGAGA GTGTGTCTCA GGAGAATTCA 3420 

ATGGGCTCTA GAGAGACACA CAGAAAGTTT GG6CATTTGG GAAATTTTCA A6GRTGTATG 3480 



TATGGYTCAC 6TATQ6WGCA GGTTGTCCTG GTCCYKGGGT GCAGGGAA6T GOGCTGCAGG 3540

GAAGTGQATT GGAGGGGAGC TTGAGGAATA TAAGGAGCGG GGGTGGAGAC TCAGGCTATG 3600

GACAAGGACA GCCCCAAGGT TGGGAAGACX: TGGCCTTAGT CGTCCTCAGC CTAG6GCAGG 3660

GCA6TGAAGA AAGCTCrCCC CGCTCCTGCT GTAATGACCC AfiMTIAGCCT CCCCAGGCCX5 3720

GCATCTTATG TGTGTCTTCC ACCATCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780

CATTGTGCAA GGCTCGGAAG AGAACCAGGA AGTGAAACTG GOTGAAAACft GAAAGCTCAA 3840

TGGATGG6CT AGGTTCCCAG ATCATTAGGG CASAGTTTGC ACGTCCTCTG GTTCACTGGG 3900

AATCCACCXA 6CCCACGAAT CATCTCCCTC TTT6AAGGAT TTTWATTTCT ACTGQGTTTT 3960

GOAACAAACT CCTGCTGAGA CCCCACAGCC AGAAACTGAA A6CAGCAGCT CCCCAAA6CC 4020

TGGAAAATCC CTAAGAGAAO GCCT6GGQQA MA66AAKTGG AGTGACA6G6 GACAGQTAGA 4080

GABAA0GG6G CCCAATG6CC AGGGAGTQAA GGA66TGG0G TT6CT6A6A6 CAGTCTGCAC 4140

ATGCTTCTGT ClGAOIGCAa GAAGGTGTTC CAB6GTG6AA ATTACACTTC T0QTACCTG6 4200

AGACGCTGTT T6T6GGAGCA CTGGGCTCAT GCCTGGCACA CAATA6GTCT GCAATAAACC 4260
ATGGTTAAAT CCTGAAAAAA AAAAAAAAA



Seq ID NO: 333 Protein sequence 
Protein Accession #: NP_000011






1 11 21 31 41 51

I I I I I I

MTLGSPRKGL LMLLMALVTQ GDPVKPSRGP LVTCTCESPH CKGPTCRGAW CTVVLVREEG 60

RHPQEHRGCG NLHRELCRGR PTEFVNHYCC DSHLCNHNVS LVLEATQPPS EQPGTDGQLA 120

LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SLILKASEQG DTMLGDLLDS 180

DCTTGSGSGL PFLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSWF 240

RETEIYNTVL LRHDNILGFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL 300

RLAVSAACGL AHLHVEIFGT QGKPAIAHRD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD 360

YLDIGNNPRV GTKRYMAPEV LDEQIRTDCF ESYKWTDIWA FGLVLWEIAR RTIVNGIVED 420

YRPPFYDVVP NDPSFEDMKK VVCVDQQTPT IPNRLAADPV LSGLAQMMRE CWYPNPSARL 480
TALRIKKTLQ KISNSPEKPK VIQ

Seq ID NO: 334 DNA sequence
Nucleic Acid Accession #: NM_004126.1
Coding sequence: 108-329

1 11 21 31 41 51

I I I I I I

GGCACGAQCT CGTGCCGGCC TTCAGTTGTT TCGGGACQCG CCQAGCTTOS CCGCTCTTCC 60

AGCG6CTCCG CTGCCA6AQC TAGCCGQAGC GCGGTTCTGG GGG6AAAATG CCTQCCCTTC 120

ACATCGAA6A TTT6CCAGA6 AAGGAAAAAC TGAAAATGGA AGTTGA6CAG CTT06CAAAG 180

AAGTGAAGTT GCAGAGACAA CAAGT6TCTA AATGTTCTGA AGAAATAAA6 AACTATATTG 240

AA6AACGTTC TGGAGAG6AT CCTCTAGTAA A6GGAATTCC AGAAGACAA6 AACCXXrrTTA 300

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360

AGAACTAGTT TGTTTTAOTT TTCCCASATA AAACCAACAT GCTTTTTAAO QAAGGAAGAA 420

TQAAATTAAA AGGAIC2ACTTT CTTAAGCACC ATATAGATAO GOTTATSTAT AAAAGCATAT 480

QTQCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GA0CAGAGA6 TATCA6ATGT 540

ACAATTATGG AAATAAGAAC ATTACTTGAO CATGACACTT CTTTCAOTAT ATTGCTTGAT 600
GCTTCAAATA AA6TTTTGTC TT

Seq ID NO: 335 Protein sequence
Protein Accession #: NP_004117.1

1 11 21 31 41 51

I I I I I I

MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED
KNPFKEKGSC VIS

Seq ID NO: 336 DNA sequence
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940

1 11 21 31 41 51

I I I I I I

6CACGAGG6A ACAAOCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGQACCAT 60

CAAQCTCT6C TAACTGAATC TCATCCTAAT TGCAGGATCA CATT6CAAAG CTTTCACTCT 120

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CA6AAAGTAA AGTTCCATCC 180

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTQ GGTCTTGACC CCTQ6AATTT 240

AAGAAATTCT TAAA6ACAAT GTCAAATATG ATCCAA8A6A AAATGTGATT TGAGTCTGGA 300

GACAATTQTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA OCACAACTTG 480

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCT6TATTT TCTGGTTCTC TTGCCTTTTT 600

TTATGATTCT TGTTACA6CA GAATTAQAAG AGAGTCCTQA GQACTCAATT CAGTTGG6AG 660

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATQAATGTTA CCAAAAGATT ATGCAAGACC 720

CCATTCAACA AGCAGAA6GC GTTTACTGCA ACAOAACdG GGATGGATGG CTCTGCTGGA 780

ACQATGTTGC AGCAGGAACT OAATCAATGC AGCTCIOCCC TGATTACTTT CAGQACTTTO 840

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAO 900

CAAGCAACA6 AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC QAGAAAGTGA 960

AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC 1020

TGCTTATCTC GCTTOGCATA TTCTTTTATT TCAAGAGCCT AAQTT6CCAA AGQATTACCT 1080

TACACAAAAA TCTOTTCTTC TCATTTGTTT GTAACTCT6T TGTAACAATC AnCACCTCA 1140

CT6CA6TQ6C CAACAACCA6 GCCTTAGTAO CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAA66CATTT 1260

ACCTACACAC ACTCATTGTG GTGGCCQTGT TTGCAGAQAA GCAACATTTA ATGTGGTATT 1320

ATTTTCTTGG CTGGGGATTT CCACTGATTC CTOCTTGTAT ACATGCCATT GCTAQAAOCT 1380

TATATTACAA TGACAATTGC TQQATCAGTT CTQATACCCA TCTCCTCTAC ATTATCCATG 1440 

GCCCAATTTa TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGC6TTC 1500 

TCATCACCAA GTTAAAAOTT ACACACCAAG OGGAATCCAA TCTGTACATG AAAGCTGTGA 1560 

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGOSAC 1620 

CTGftAGGAAA GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC 1680 

AOGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGQTTCAA 6CAATTCTGA 1740 

6AAGAAACT6 GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800 

TTOGTAOTOC GTCTTACACA GTGTCAACRA TCRGTGATGG TCCAQGTTAT AGTCATGACT 1860 

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920 

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT 1980 

AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040 

GOOAATOTCA TAAAGAAGAG CCTTCACATG AAATTAOTAO TaTOTTGATA AflAGTOTAAC 2100 

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TQGTTT6TAA TGTTTGTCM TAAATACTCC 2160 

CACTATGCCT GAT6TGACX3C TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC 2220 

ACAATCAACT TTTCT6AGCT GGTGTAAGCC AGTTCXIAGCA CACCATTGAT GAATTCAAAC 2280 

AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TAOCCTTATT CSCCCCAA6A 2340 

GACCT AGCTA AGGTCTATAA ACATGAAGGO AAAATTAGCT TTTAGTTTTA AAACTCTTTA 2400 

TCCCATCTTO ATTOGGGCAG TTGACTTTTT TTTTTTCCCA GAGT6CCGTA GTCCTTTTTO 2460 

TAACTACXICT CTCAAATGGA CAATACCAGA AfSTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520 

CTATGAAAAG CAACTGAOTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580 

ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC 2640 

TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACAOCTTGTC AACCTCTTCC 2700 

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CCCTTCCATT 2760 

TCTAC TGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATTTCTT 2820 

ATT T TCTTG G AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTTGTAAAT ACTCCATTAT 2880 

TTTATTTTAT AGTCTCAAAT CAAATACATA CAACCTATGT AATTTTTAAA GCAAATATAT 2940 

AATGCAACAA T0IGT6TAT6 TTAATATCTG ATACTOTATC TGCGCTGATT TTTTAAATAA 3000 
AATAGAGTCT GGAATGCT 

Seq ID NO: 337 Protein sequence
Protein Accession #: NP_005786.1

1 11 21 31 41 51 

I I I I I I 

NBKKCTLYFL VLLFFFMIIiV TAELEBSPBD SIQL6VTRMK ZMrAQyECYQ KIMQDPZQQA 60 

EGVYOJRTW) GMLCMNDVAA GTBSMQLCOT YPQDFDPSEK VTKICDQDGN HPRHPASNRT 120 

WTNYTQCNVN THEKVKTAUI LBYLTIIGHG LSIASLLISL GIPPYPKSLS CQRITLHKNL 180 

PPSPVCNSW TIIHLTAVAN NQALVATNPV SCKVSQPIHL YLMGCNYFWM LCEGIYLHTL 240 

IWAVPABKQ HUWYYFLGW GPPLIPACIH AIARSLYYMD MCWISSDTHL LVIIHGPICA 300 

ALLVNLFPLL NIVRVLITKL KVIHQABSKL YMKAVRATLI LVPLLGIEFV LIPHRPE6KI 360 

AEEVTOYIMB ILMHFQGLLV STIPCFPNGE VQAIIiRSNKH QyKIQPGNSF SNSBALRSAS 420 
YTVSTISDGP 6YSHDCPSEH LNGKSIHDIB NVLLKPBNLY N 

Seq ID NO: 338 DNA sequence
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379

1 11 21 31 41 51

I I I I I I

GCACQATCTG TTCCTCCTGG GAAGAT6CAG AGGCTCATGA TGCTCCTCGC CACATGGGQC 60 

GCCTGCCTGG GCCTGCTGGC AGTGGCA6CA GTGGCAGCAG CAGGTGCTAA CCCTGCCCRA 120 

CGOGACACCC ACaGCCTGCT GCCCACCCAC CGGCGCCAAA AGAGAGATTG GATTTGGAAC 180 

CAQATOCACA TTQATOAAGA GAAAAACACC TOVCTTCCCC ATCATGTAGG CAAGATCAAG 240 

TCAAGCGTGA GTOGCAAGAA TGCCAAOTAC CTGCTCAAAG GAGAATATGT GGGCAAGGTC 300 

TTCCGGGTCG AT6CAGAQAC AGGAQACGTO TTCGCCATTG AGAC3GCTGGA CCGGGAGAAT 360 

ATCTCAGAGT ACCACCTCAC TGCTGTCATT GTGGACAAGG ACACTGGTGA AAACCTGGAG 420 

ACTCCTTCCA GCTTCACCAT CAAAGTTCAT GACGTGAAGG ACAACTGGOC TGTGTTOVOG 480 

CATOGGTTGT TCAATGCGTC CGTGCCTGAG TOGTOG GC TG TQGGQACCTC AGTCATCTCT 540 

GTGACAGCAG TGGATGCAQA CGACCCCACT GTGGGAGACC ACGCCTCTGT CATGTACCAA 600 

ATCCT GAAG G GGAAAGAGTA TTTTGCCATC GATAATTCTG GACGTATTAT CACAATAACG 660 

AAAA6CTTGG ACCGAGAGAA GCAGGCCAGG TATGAGATCG TGGTGGAAGC GCGAGATGCC 720 

CAGGGCCTCC GGGGGGACTC GGGCAOSGCC ACCGTGCTGG TCACTCTGCA AGACATCAAT 780 

GACAACTTCC CCTTCTTCAC CCAQACCAAO TACACATTTG TOGTQCXTOl AGACACXX38T 840 

QTGGGCACCT CTGTGGGCTC TCTGTTTGTT GAGGACCCaWS ATGAGCOCCA 6AACGGGATG 900 

ACCAAGTACA GCATCTTGCG GGGCGACTAC CAGGACGCTT TCACCATTGA GACAAACCXX: 960 

GCCCACAACG AGGGCATCAT CAAQCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA 1020 

TACAGCTTCA TOSTCXJAGGC CACAGACCCX: ACCATCQACC TCOGATACAT GAGCOCTCCC 1080 

GCGGGAAACA GAGCCCAGGT CATTATCAAC ATCACAGATG TGGACGAGCC CCX3CATTTTC 1140 

CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAOCCTCT GATTGGOVCA 1200 

GTGCTGGCXa TGGACCCTGA TGOGOCTAGG CATAGCATTG QATACTCCAT CCGCAGGACC 1260 

AGTOACAAGG QCCAOTTCTT C06AGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA 1320 

CTGGACRQAG AA6TCTACCC CTGGlATAAC CTGACTGTGG AGGCCAAAGA ACTGGATTCC 1380 

ACTGGAACCC CCACAGGAAA AQAATCCATT GTGCAAGTCC ACATTGAAQT TTTGGATGAG 1440 

AATGACAATQ CCCCGGAGTT TGCCAAGCCC TACCAGCCCA AAGT6TOTGA GAACGCTGTC 1500 

CATGGCCAGC TGGTCCTGCA GATCTCOGCA ATAGACAAGG ACATAACACC ACGAAACGTG 1560 

AAGTTCAAAT TCRCCTTQRA TACTGAGAAC AACTTTACrC TCACGGATAA TCACGATAAC 1620 

ACGGOCAACA TCACAGTCAA GTATGGGCaG TTTGACCGGO AGCATACCAA GGTCCACTTC 1680 

CTACCCGTGG TCATCTCAGA CAATGGQATG CCAAGTCGCA OGGGCACCAG CACGCTGACC 1740 

6TGGCCX3TGT GCAAGTGCAA CQAGCAG66C GAGTTCACCT TCTGOSAGGA TATGGCCGCX; 1800 

CAGGTGGGCG TGAGCATCCA GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA 1860 

GTGAT CAOC C TGCTCATCTT CCTGOQGCSGG OtSGCTCOGGA A6CAGGCCC6 CGCGCACGGC 1920 

AAGAGCGTGC CGGAGATCCA CGAGCAGCTG GTCACCTAC6 ACGAGGA6G6 CGGCGGCGAG 1980 

ATGGACACCA CCftGCTACGA TGTGTCGGTG CTCAACTCX5G TGCGCCGCGG CGGGGCCAAG 2040 

CCCCCGCGGC CCGCGCTGGA CGCCCGGCCT TCCCTCTATG CGCAGGTGCA GAAGCCACCG 2100 

AGGCACQOGC CTGGGGCACA CGGAGGGCCC GGGGAGATGQ CAGCCATGAT 0GA6GTGAAG 2160 

AAGGAGQAGG OGGACCAOSA GGGOGAOGGC COCCGCTAOG ACACGCTGCA GATCTACGGC 2220 

TACQAGGGCT OCGAGTCCAT AGOCXSAGTCC CTCAGCTCCC TGGGCACOGA CTCATCCOAC 2280 
TCTQACQTGG ATTAOGACTT CCTTAACGAC TGGGGACCC3V 
CTGTAGGGCT G6GACCC0CG GGAOGAGCIG CTGTATTAG6 
CTGGGGACCC AAACCCCCTG CAGCOCASGC CAGTCAGACT 
AAATCGCAGT GACTCCCCAG CCCAGCAOCX: CTTCCTOSTG 
CCTTGGGATA GCAAACTCCA QOTTCCTQAA ATATCCAGGA 
TTCTCAAATG CTG6CAAATC CAGGCTGGTG TTCTGTCTGG 
CTGTCftGCCA CAGACCGCCO TCTAACTCAA AQ ACTTCC TC 
GCAAAACASA CTGTGTTTAA CTGCTOCAGQ GTCTTTTTCT 
TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTG6A6GCA 
TGTGGGGCAG GATTCTCTGC AGCCCATTCC CAAGGGAGAC 
GGGAGCCCTA GCCCTGCTCC AACTCCATAC TCCACTCCAA 
COCTCTCCAG GCCIGTCAAG AGGGAGGAAG GGGCCCCATG 
CTGAAGTGAC CTCACTGGCC TGCCATGCC3V GTAACTGTGC 
ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTQTGAA 
QAGATCAGGA GTGACAGATC ACAGGGTGAG GGCCACCTCC 
AGGCCTGGAA GAGCTGAGAC CTTGCTTTGA QACTCCTCAG 
AGAAGGGGCA GATGTTCCCX3 GAGATCAGAA 6ACX3TCTCCC 
GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA 
CAAGATGTGG CCTTTAGCAA AACTGACAAT GTCCAAACCC 
GCCGAGCATG TGTCTTTACA CCTCGCTGTT GTCACATCTC 
CACCTTGCAG AAGGAAGGCX: CTGCCCTGCC GAACXrTCTGT 
ACTGGAACXST TTCACTGCAA ACACACCTTG 6AGAAGTG6C 
AG6GAAGGAG ACACCAAGCT CACCCTTCGT CATGGACXXSA 
CCCrCACACT GCAAGGGATT OTAGATAACA CTGACTTGTT 
TTCTTATAAT GATTTTTTTA CTAATGATAC TTACAA6TTT 
GAATAAQGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG 
TTTTTAGTTG GAAAAACAAT TCCTGTAACC TTCTATTTTC 
TACAGATAAT GTCTATATAT TGGCCAAACT GGTGCATGAC 
CCTAAATAAA GAAAAATCTT TAGCCTGGGC AACAAAAAAA 

Seq ID NO: 339 Protein sequence
Protein Accession #: NP_001786






NQRLMMLLAT 
NTSIiPHHVGK 
VIVDKDTQEM 
PTVGDHASVM 
TATVLVTLQD 
DYQDAFTZET 
miTDVDEPP 
VTKKGDIYNB 
KPYQPKVC^ 
GQFDREKTKV 
WAILLCILT 
SVLNSVRRGQ 
OGPPYDTLHI 
ELLY 



11 
I 

SGACLGIJAV 
IKSSVSRKKA 
LBTPSSPTIK 
YQILKCKEYF 
INDNPPPPTQ 
NPAHMBGIZK 
IPQQPFYHPQ 
KBLDREVYPW 
AVHGQtiVLQI 
HPLPWISDN 
ITVITLLIFL 
AKPPRPALDA 
YGYBQSBSIA 



21 
I 

AAVAAAGANP 
KYLLKQEYVG 
VHDVNDNWPV 
AZOKSGRIIT 
TKYTPWPED 
PMKPLDYEYI 
LKBNQKKPLI 
YNLTVEAKEL 
SAIDKDITPR 
QMPSRTGTST 
RRRLRKQARA 
RPSIiYAQVQK 
ESLSSLGTDS 



31 
I 

AQRDTHSLLP 

KVFRVDAETG 
FTHRLFNASV 
XTKSLDREKQ 
TRVGTSVOSL 
QQYSFIVEAT 
GTVIiAMDFDA 
DSTGTPTGKE 

NVKFKFrmr 

LTVAVCKCMB 
HGKSVPEIHB 
PPSaAPOAHG 
SDSDVDYDFL 



GGTTTAAGAT 
CX3GC0QAGGT 
CGAGGCACCA 
GGTCCCAGAG 
ATATATGTCA 
GCTCAGACAT 
TGQCTCCCCA 
AGGGTCCCTG 
AAGGCCTGGA 
TGACCATCAT 
GTGCCCCACC 
6CA6CTCCTG 
TGTACTGA6C 
TTCATTCTGQ 
ACACCCACCC 
CACCCCTCCA 
CTTCTCTGCC 
TCCCTTGGTT 
ACTCATQACT 
AGGGAACTGA 
QGTCACCCAT 
ATCAGTCAAC 
GGTTCCCACT 
TGTTTTAACC 
CTAGCTCTCA 
TTAACAATAT 
TATAATTGTA 
AAGTACTGTA 



41 

I 

THRRQRRDWI 
DVPAIERUJR 
PB8SAVGTSV 
ARYEIWBAR 
PVHDPDEPQN 
DPTXDLRYKS 
ARHSIGYSIR 
SrVQVHIBVL 
BNNFTLTOKH 
QGEFTFCEDM 
QLVTYDEBOG 
QPGEHAAHZE 
NDHGPRFKNL 



GCTGGCPGAO 
CACTCTGGGC 
CA6CCTCCAA 
ACCTCATOWS 
GTGATGACTA 
CCACATAACC 
AGGCTGCAAA 
AACGCCCTGG 
CAGCTTQACT 
QCCCTCTCTC 
ACTCCCCAAC 
ACCTT6GGTC 
ACTGAACCAC 
AGGGGCAGT6 
CCTCTGGAGA 
GTTTTGCCTG 
TCACCTGGTC 
TAGAGGAACC 
GCATGAOQGA 
CCCTCAGGCA 
GCATCATTCC 
AGAGAGGGGC 
CTGGCAAAGC 
AATAACTAGC 
CAGACATATA 
TAATTCAGOT 
GTAATTGCTC 
TrTTTTTATA 



51 
I 

NNQHBIDBEK 
ENISEYHLTA 
ISVTAVDADD 
OAQGIiRGDSG 
RNTKySIldtG 
PPA(3ilRAQVI 
HTSDKGQPFR 
DEKDNAFEFA 
DNTANITVKY 
AAOVGVSIQA 
GEWDTTSYDV 
VKKDBADBDO 
ACLY85DFRE 



Seq ID NO: 340 DNA sequence 
Nucleic Acid Accession #: NM_003086 
Coding sequence: 112-1593 



GCGGAGGGTG 
CCXX3CCACCC 
AAOGOCACAG 
CTGAG6GCCG 
CAGATCTGGA 
AGCCACCTGG 
GTGCCOGGTC 
CA6TCG6AG0 
CAGA00QT6T 
ATCTACAGTQ 
6CCGTGGACC 
CAGCGCTACA 
G06CGCCCCG 
CGCGACTGGG 
AAGGCCACCA 
GT6CTGCAGG 
AATCAGQAOQ 
AA6TGTGCCT 
CAGTCCACCQ 
CGCATCACAC 
GCCGCCTCGG 
CCCATCATCG 
CTGGACGCCA 
AACATCAAAG 
AG0GGC6ACA 
AAQQTGGGOG 
ACGGTG6AGC 
CCACATGGOG 
GGCGGGAGGC 
CCTGTC6CCC 



11 

I 

CGTGCOGGCC 
ACCTCC0G6G 
CCQAGOOGOT 
AGGOGTTOSG 
06CTGGA6CA 
GCCGCTACCT 
CCGACTGCCX3 
G6CA006GCG 
CCCCC6CCGA 
TCACCOGTAA 
GCGACGTGCC 
GCGTGCAGAC 
AGCC36GCCAC 
AGGGCOGTTA 
AGGTGGGCAA 
CX3GCCAAC6A 
AGGAGACCGA 
TCXX3TACCCA 
CCTCCAGCAA 
TGAGGGOGTC 
1GGAGACAGC 
TGTTCCGCGG 
ACCGCTCCAG 
ACrCCACAGG 
CTCCTGTGGA 
GGC36CTACCT 
COGCCTOGCT 
GCTCCTGCCA 
AAGCCXXCTT 
CTATQGACTC 



21 
I 

6CGGCAGCCG 
GCCGCX3CAGC 
GCAGATCXAG 
GTTC3UU3GTG 
GCCCCCTGAC 
GGCGGGGGAC 
TTTCCTCATC 
CTACTTCGGC 
6AA6TGGAGC 
GCGCTACGCXS 
CTGGGGCGTC 
CGCCQACCAC 
TGGCTACAOG 
CCTGGCGCCX3 
GGACGAGCTC 
GAGGAACX3TG 
CCAGQAGACC 
CACGGGCAA6 
GAATGCCAGC 
CAATGGCAAG 
AGGGQACTCA 
G6A6CATGGC 
CTATOACGTC 
CAAATACTGQ 
CTTCTTCTTC 
OAAGQGCGAC 
CTG6QAGIAC 
ACCCTCCCTG 
GCCTTTCAAA 
OCCACTCTCC 



31 
I 

AACAAAGGAG 
GGCCTCTGGT 
TTCGGCCTCA 
AACGOGTCOG 
GAGGC9GGGCA 
AAGGACX3GCA 
GTGGCGCACQ 
GGCACCGAGG 
GTGCACATCG 
CACCTGAGOO 
GACTCXjCTCA 
C3GCTTCCTGC 
CTGGAQTTCC 
TOGGGGCCCA 
TTTGCTCTGG 
TCCAGGCGCC 
TTCCA6CTGG 
TACTGGA(3GC 
TGCTACTTTG 
TTTGTGACCT 
GA6CTCTTCC 
TTCATCGGCT 
TTCCAGCTGG 
ACGGTGGGCA 
GA67TCTGCG 
CAG0CAG60G 
TAG6GCC6GC 
CTAACCCCTT 
CTGGAAACCC 
CCTCCGCCCO 



41 

I 

CAGGGGCGCC 
CTACTGCCAC 
TCAACTGCGG 
CCAGCAGCXrr 
GCGCGGCOGT 
ACGTGACCTG 
ACGA0GGTCX3 
ACCGCCTGTC 
CCATGCACCC 
CGCGGCCGGC 
TCACCCTCGC 
GCCAOGAOGG 
QCTCCGGCAA 
GCGGCACGCT 
AGCAGAGCTG 
AG6QTATGGA 
AGAT0GACC9G 
TGAOGGCCAC 
ACATOQAGTG 
CCAAGAA6AA 
TCATGAAGCT 
6CCGCAAGGT 
AGTTCAACGA 
QTGACTCCGC 
ACTATAACAA 
TCCTGAAGGC 
CCGTCCTTCC 
CTCCGOCAGG 
CAGAGAAAAC 
GGTTCCCTAC 



51 
I 

GCCGCAGGGA 
CATGACCGCC 
CAACAAGTAC 
GAAGAAGAAG 
6TGCCTG06C 
CGAGOGCGAG 
CTGGTCGCTG 
CTGCTTCGCX5 
TCAGGTCaUVC 
CGACGAGATC 
CTTCCA6GAC 
GCXSCCTGGTG 
GGTGGCCTTC 
CAAGGCQGGC 
CGCCCA6GTC 
CCTGTCTGCC 
0GAC3^0CAAA 
OGGGGGCGTG 
GCGTGACCGG 
TGGGCAGCTG 
CATCAACOQC 
CACGGGCACC 
TGG06CCTAC 
GGTCACCAGC 
QGTGGCCATC 
CTOGGOGGAA 
CCGCCCCT6C 
TGGGCTCCA6 
GGTGCCCCCA 
TCCCCTCGGG 



2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



60 
120 
180 
240 
300 
360 
420 
480 
540
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
TCAGCGGCTG CGGCCTGGCC CTGGOAGGGA TTTCAGATOC CCCTGCCCTC TTGTCTGCCA 1860

GGGGGOQAGT CTOGGACCTC TTTCTTCTGA CCTCAGAGQQ CTCTGAGCCT TATTTCTCTG 1920 

GAAG CGGCTA AQG6ACX36TT GGGG6CTG6G A6CCCT6GGC GTGTAGTGTA ACTGGAATCT 1980 

TTTGCCTCTC CCAGCCACCT CCTCCCAGCC CCCCAGGAGA GCTGGGCACA TGTCCCAAGC 2040 

CTOTCAGTGG CCCTCCCTGG TGCACTGTCC CCGAAACCCC TGCTTGGGAA GGGAAGCTGT 2100 

CGGGAOGGCT AGGACTGACC CTTGTGGTGT TTTTTTGGGT GGTGGCTGGA AACAOCCX:CT 2160 

CTCCCA0GT6 G6AGAG6CTC AGCCTGGCTC CCTTGCCTG6 AGCGGCAGGG G6TGA086CC 2220 

ACA6GGTCTG CCC6CIGCAC GTTCTGCCAA GGTG6TGGTG G06GGCG6GT AGGGGTGTGG 2280 

GGGCCGTCTT CXTTCCrGTCT CTTTCCTTTC ACCCTAGCCT GACTGQAAGC AGAAAATGAC 2340 

CAAATCAGTA TTTTTTTTAA TGAAATATTA TTGCTGGAGG CX3TCCCAGGC AAGCCTGGCT 2400 

GTAGTAGGGA GT6ATCTGGC GGGGGGCGTC TCAGCACXTCT CCCCAGGGGG TGCATCTCAG 2460 

CCCOCTCTTT CXXtTCCTTCC CGTCCAGCCC CAGCCCT6G6 CXrTGGGCTGC GGACACCIGG 2520 

6CCAGAGCCC CTGCTGTGAT T6GT6CTCCC TGOQCCTCCC OOGTGGATGA AGCCA6GGQT 2580 

CGCOCCCTCC GGGAGCCCTG GGGTGAGCC6 CCGGGGCCCC CCTGCTGCCA GCCTCCCCCG 2640 

TCCCX:AACAT GCATCTCACT CTGGQTGTCT TGGTCTTTTA TTTTTTGTAA GTGTCATTTG 2700 

TATAACTCTA AACGCCCATG ATAGTAOCTT CAAACTGGAA ATAGOSAAAT AAAATAACTC 2760 
AGTCTGC 

Seq ID NO: 341 Protein sequence
Protein Accession #: NP_003079

1 11 21 31 41 51

| | | | | |

MTANGTAEAV QIQFGItlNOG NKYLTABAFG FKVNASASSL KKKQIHTLEQ PPDEAGSAAV 60 

CLRSHI/atYL AADXDGNVTC EREVPGPDCR FLXVAHDDGR WSLQSBAHRR YFGGTBDRLS 120 

CPACnVSPAE KWSVHIAMHF QVMIYSVTRK RYAHLSAKPA DSIAVDRDVP WGVCSLITLA 180

FQDQRYSVQT ADHRPLRHDG RLVARPEPAT GYTLEPRSGK VAFRDCEGRY LAPSGPSGTL 240 

KAGKATKVGK DELFALEQSC AQWLQAANE RNVSTRQGMD LSANQDEETD QBTFQXiEIOR 300 

DTKKCAFRTH TGKYWTLTAT GGVQSTASSK MASCYFDIEW RORRITLRAS NGXFVTSKKN 360 

GQLAASVETA GDSBIiFLHKIi XHRPUVFRO BBCFIGCRKV TGTIiDANRSS YDVFQLBFHD 420 

OAYNIXDSTG imiTVGSDSA VTSSGDTtVO FFFEFCDlOnC VAIKVGGRYL KGOHASVIiKA 480 
SABTVDPASb WEY 



Seq ID NO: 342 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 660..1705

1 11 21 31 41 51 

I I I I I I 

CQCTCCGCAC ACATTTCCTG T0GC6GCCTA AGG6AAACTG TTGGCCGCTG 6GCC0006GG 60 

GGGATTCTTG GCAGTTGGGO GGTCGGTCGG 6AGCGAGGGC G6AGGGGAAG GGAGGGGGAA 120 

COGGGTTGGG GAAGCCAGCT GTAGAGGGOG GTGACCX3CGC TCCAGACACA GCTCTGOGTC 180 

CTCGAGCOGG ACA6ATCCAA GTTGGGAGCA GCTCTGG6TG CGGGGCCTCA GAGAATGAGG 240 

COGGCGTTCG CCCTGTGCCT CCTCTGGCaG GCOCTCTGGC CCGGGCOGGG CG6CG60GAA 300 

CACCCCACTG CCGACCGTGC TQGCTGCTOG GCCTCG6GGG C5CTGCTACAG CCTGCAOCAC 360 

GCTACCATGA AGOGGCAGGC GGCCGAGGA6 GCCTGCATCC TGCGAGGTGG GGCGCTCAGC 420 

ACCGTGCGTQ CGGGCGCCX3A GCTGCX30GCT 6T6CT0GC3GC TCCTGCGGGC AGGCCCAGGG 480 

CCCGGAGGGG GCTCCAAAGA CCTGCTGTTC TGGGTCGCAC TGGAGCGCAG GOGTTCCCAC 540 

TGCACCCTGG AGAACGA6CC TTTGCGGGGT TTCTCCTGGC TGTCCTCXX3A CCCCGGCGQT 600 

CTCGAAAGOG ACACGCTGCA GTGGGTGGAG GAGCCCCAAC GCTCCTGCAC CGCGCGGAGA 660 

TGCG06GTAC TCX:aG6CCAC CGGTGGGGTC GAGCCCGCAG CTGGAAGGAG ATGCX3ATGCC 720 

ACCT6C3GG6C CAAOGGCTAC CTGTGCAAST ACCA6TTTGA G6TCTTGTGT CCT60GC0GC 780 

GCCCGGGG6C CQCCTCTAAC TT6AGCTATC 6CGC6CCCTT CCAGCTQCAC AGCGCCGCTC 840 

TGGACTTCAG TCCACCTGGG ACOGAGGTQA GTGC3GCTCTG COGGGGACAG CTCCX3GATCT 900 

CAGTTACTTG CATCGOGGAC GAAATCGGCG CTCGCTGGGA CAAACTCTCG GGCX3ATGTGT 960 

TGTGTCCCTG CCCXX3GGAGG TACCTCOGTG CTGGCAAATG CGCAGAGCTC CCTAACTGCC 1020 

TAGAOQACTT GGGAGGCTTT GCCT6G6AAT GTGCTA06GG CTTGGA6CT0 GGGAAGGAOG 1080 

GC06CTCTTG TGTGACCAGT GGGGAAG6AC AGCCXSACXXT TGGGGGGACr GGG6T6CCCA 1140 

CCAGGCGCCC GCCGGCCACT GCAACCAGCC COGTGCCGCA GAQAACATGG CCAATCAGGG 1200 

TCGACGA6AA GCTGGGAGAG ACACCACTTG TCCCTGAACA AGACAATTCA GTAACATCTA 1260 

TTCCTGAGAT TCCTOGATGG GGATCACAGA GCAGGATGTC TACKCTTCAA ATGTCCCTTC 1320 

AAGC0GA6TC AAAGGCCACT ATCACCCCAT CAGGGAOC6T GATTTOCAAG TTTAATTCTA 1380 

CGACTTCCXC TGCCACTCCT CAGGCTTTCG ACTCCTCCTC TGCCGTGGTC TTCATATTTG 1440 

TGAGCaCAGC AGTAGTAGTG TTGGTGATCT TGACCATGAC AGTACTGGGG CTTGTCAAGC 1500 

TCTGCTTTCA CGAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGCXX3GGCC 1560 

TGGAGAGTGA TCCTGAGCCC GCTGCTTTG6 GCTCCAGTTC TGCACATTGC ACSVAACAATG 1620 

GGGTQAAA6T OaGGGACTGrr GATCTGaSGG ACAGAGCAGA GGGT60CTT0 CTQGGQGAaT 1680
OCCCTCTTGO CTCTAGTGAT 6CATA6 

Seq ID NO: 343 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

| | | | | |

t4GKDFHTECrP KAFATXAKID KSnSLIKLKSF CTAKBTIIRV NSQPTDWQKT FAZYPSDKGV 60 

ZARIYXELEQ lYKKRKPTKT LRTHFLSRPK GNCNPLGPRG DSWQLGGPSG ARABGKG6GT 120 

GliGKPAVEGG DRAPDTALRP RAGQIQVGSS SAOSASENEA GVRPVPPIAG ALARAGRRRT 180 

PHCRPCffLLG LGGLLQPAPR YHEAAGGRGG LHPARWGAQH RACGRRAARC ARAPAGRPRA 240 

RRGIiQRPAVL 6RTGAQAFPL HPGERAFAGF LLAVIiRPRRS RXRHAAVGGG APTLLHRAEM 300 

RGTPGBRNGR ARSHKEMRCH ItRANGYLOOr QFEVLCPAPR FGAASNLSYR APFQI£SAAL 360 

DFSPPGTEVS ALCRGQLPZS VTCIADEIGA R14DKLSGDVL CFCFGRYLRA GKCASLPNCL 420 

DDLGGFACBC ATQFELGKDG RSCVTSGBGQ PTLGGTGVPT RRPPATATSP VPQRTWPIRV 480 

DEKIiGETPLV PEQDMSVTSI PEIPRW6SQS TMSTLQMSLQ ABSKATITPS GSVISKFNST 540 

TSSATPQAPD SSSAWFIFV STAVWLVIL TMTVLGLVKIi CFHESPSSQP RKESMGPPGL 600 
ESDPEPAALO S86ABCINHG VKVGDCDURD RABQALLAES PU3S6DA 
Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107

1 11 21 31 41 51

ARAGCCCTCA OCCTTTOTGT CCTTCTCTGC GCCCSC5AGTGG CTQCAGCTCA CCCCTCA6CT 60 

CCCCTTGGGG CCCAGCTGGG AGCCQAGATA GAAGCTCCTG TCGCOOCTGO GCTTCTCGCC 120 

TCCCGCAGAG GGCC31CACAG AGACCGGGAT GGCCACCTCC ATOOOCCTCC TCCXOCIQCT 180

GCTGCTGCTC CTGACCCAGC CC6GGGC3QGG GAiOGGGAGCT QACAOSQAGO OSGTQGTCTG 240 
OQTGGGOACC GCCTOCTACA CGGCXCACTC GGGCAAGCTG AGC3GCTGC0G AGGCCCAGAA 
CCACTGCAAC CAGAACGGGG OCAACCTQQC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 

CCTCOVGCGA CTACTGGCCC AGCTCCTGAG GCGGGA6GCA 6CCCTGA0GG CX3AGGATGAQ 420 

CAAGTTCTGG ATTGGGCTCC AGOGAGAGAA GGGCAAGTGC CTGGACCCTA GTCTGOOGCT 480 

GAAOQOCTTC AGCTCGOTCG GCGQGGGGGA GGACACQCCT TACTCTAACT GGCACAAGGA 540 

GCraWGAAC TCOTGCATCT CCAA6CGCTG TGTGTCTCTO CTGCTGGACC TGTCCCAGCC 600 

OCTCCTTCCC AACOSCCTGC COVAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGQCTCCCC 660 

OGQAAGTAAC ATTGA6GGCT TCGTGTGCAA GTTCftOCTTC AAAGGCATGT OCCGOCCTCT 720 

GGCCCTGGGG GGCCCAGGTC AGGTCACCTA CACCACXCCC TTCCAGACCA CCAOTTCCTC 780 

CTTGGAGGCT GTGCCCTTTG CCTCTGCGGC CAATGTAOCC TGTGGGGAAQ OTGACAAGGA 840 

0GAGACTC3VG AGTCATTATT TCCTGTGCAA GOAGAAGGCC CCCQATQTGT TCGACTGGGG 900 

CaGCTOGGGC CCCCTCTGTX; TCAGCCCCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 960 

CCACCAGOAC TQCTTTGAAG GGGGGQATCO CTCCTTCCTC TGOSGCTGCC 6ACCAGGATT 1020 

CCGGCTGCTG QATGACCTGG TGACCTGTOC CTCTCQAAAC CCTKKaGCT CCRflCCCRTG 1080 

TCGTGGGGGG GCCACGTGCG TCCTGGOAOC CCATG6GAAA AACTACACGT GCKGCTOCCC 1140 

CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTOTQ QACGTGGATG AATGCCAGOA 1200 

CTCCCCCTGT GCCCaGGAGT CTGTCAACAC CCCTGGGdGC TtCOSCTGCXS A^GCTCSS! llli 

TGGCTATGAG CCGGGCGGTC CTGGAfSAGGG GGCCTGTCAG GATGTGGAT6 AGTGTGCTCT 1320 

GGGTOSCTCG CCTTGCX3CCX: AQGGCTGCAC CAACACAORT GGCTCATTTC ACTGCTCXnXS 1380 

TOAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGtSACTCAG TGCCAGGACX3 TGGATGAGTG 1440 

TQTGG6CCCG GGGGGCCCCC TCTGC3GACAG CTTGT6CTTC AACACACAAG GGTCCTTCCA 1500 

CTGTGGCTGC CTGCCAGOCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CCATGGQGCC 1560 

TGTGTCTCTG GGACXACCAT CTCGGCCCCC CX3ATGAi3GAG GACAAAGGAG AGAAAGAAGG 1620 

GAGCACCGTG CCCCGCGCTG CAACAflCCAG TCCCACAAGO OOCCCOSAGO GCACCCCCAA 1680 

GGCTACACCC ACCACAAGTA GACCTTCQCT GTO^TCTGAC GCCCCCATC31 CATCTCCCCC 1740 

ACTCAAQATC CTGQCCXrCA 6TGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA 1800 

OGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACAOi 1860 

AAACAAOGAT GGCACTGACG 6GCAAAAGCT GCTTTTATTC TACATCCTAG GCaCXXSTGGT 1920 

Q6CCATCCXA CTCCTGCTGO CCCTGGCTCT 6GGGCTACTG GTCTATCGC31 AOCGQAQAGC 1980 

GAAGAGGGAG GAGAAGAAGO AGAAGAAGCC CCAGAATGOG GCAGACAGTT ACTCC TQGGT 2040 

TCCA6AGCGA GCTGAGAGCA GGQCCATGGA GAACCAGTAC AGTCCX3ACAC CTGG6ACAGA 2100 

CTGCTGAARG TGAGGTQQCC CTAGAGACAC TAQAGTCACC AGCCACCATC CTOVGAGCTT 2160 

TGAACTCCCC ATTCCAAAGQ GGCACCCACA TTTTTTTGAA AGACTGGACT GGAATCTTAG 2220 

CAAACAATTG TRAGTCTCCT CCTTAAAGOC CCCCTGGAAC ATGCAGGTAT TTTCTAOGGG 2280 

TOTTTGATGT TCCTQAASTG OAAGCTGTGT GTTGOCOTCC CAOGGTGGGQ ATTTOGTGAC 2340 

TCTATAAT6A rrGTTACTCC CCXTOXnrrr TOVAAT^ 2400 

GGGTGT6AGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTOCACC 2460 

ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT 2520

CCTAGOATOA AAACIAAATC AATTAATTAT TCAATTAGGT AAOAAGATCT GGTTTTTTGG 2580 

TCAAAGGGAA CATGTTOGGA CTGGAAACAT TTCTTTACAT TTOCATTCCT CCATTTCGCC 2640 

AGCACAAQTC TTGCTAAATC TGATACTQTT GACATCCTCC AQAATGGCCA GAAGTGCAAT 2700 

TAACCTCTTA GGTOGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATOGC 2760 

CTTGQGTTTA TTTGCAAAGG AAGCTTGAAA AATATQAGAA AAGTTGCTTO AAGTGCaWrTA 2820 

CAGGTGTTTG TQAAGTOICA TAATCTACGG GGCTAGGGCG AGAGAGGCCA OGQATTTGTT 2880 

CACAGATACr TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA 2940 

TGTGATCAAC ACTAACAAGG AAACAAATTC AAGQACAACC TGTCrTTGAG CCAGGGCAGG 3000 

CCTCAGACAC CCTGCCTGTG GCCCOGCCTC CACTTCATCC TGCCCOSRAT OCXaGIGCTC 3060 

C3GAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 3120 

TOAACGGGAG ATOATOCRCT GTGTTTTQAA AGTrGTCATT TTAAAGCATT TTAGCACAGT 3180 

TCATAGTCXa^ CAlSTTGATGC AGCATCCTGA QATTTTAAAT CCTGAAGTCT GGGTGGCGCA 3240 

CACACCAAGT AGG6AGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300 

TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA QAGAT GACTA ACTAAAATCA 3360 

TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCaTTTTR AAAGTTACAT 3420 

TTOCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 3480 

TCTCTCTCAC ACACACACAC ACACACACAC AOVCACACAC AGA6ACACGG CACCATTCTG 3540 

CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACOGATGGT CAGAGTCACT AGAAGTTACC 3600 

TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTOGOCTTTT TACCACXaCT GTGCAOSBkQA 3660 

ACAGACaGAG QAAATCTGTC TCCCTCCAAG GCCCCaAAGC CTCAGAGAAA OGGTGTTTCT 3720 

GGTrrraCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 3780 

CAAGGTOCAG OQTTAATACT CTTGCCAGTT TTGAAATATA GATQCTATGG TTCAGA TTGT 3840

TTTTAATAQA AAACTAAAGG GGCAGGGQAA GTGAAAGGAA ASATGGAGGT TTrGTGCG G C 3900 

TCX3ATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCftGrTGO 3960 

AACTCTGGTG TTTAACACTT AAGGGAGACA AA6GCT6TGT CCATTTGGCA AAACtTCCTT 4020 

GQCCACGAGA CTCTAGGTGA TGTOTQAAGC TGGGCAGTCT GTGGTGT6GA GAQCAGCCAT 4080 

CTOTCIGGCC ATTCAGAGGA TTCTAAAQAC ATGQCTGQAT GCGCTGCTGA CCAACATCAG 4140 

CACTTAAATA AAT6CAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTQCCCTTA 4200 

TCATTTCGGG tGAAGQAQAC ATTTCIGTO: TTGG 4260 

GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320 

GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAO GGAGGTCATG 4380 

GAACCCCTCT GTGGAACCCA CAAGQGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440 

ACCTTCCCTG GCAGGCTGQO TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GC3VCACCACT 4500 
CCCACCACGG GGG6AGAGC3C AGCAACCCAA OCAOACAGCT CAOOTTOTGC ATCT6ATGGA 4560 
AACCACTGGG CTCAAACACX3 TGCTTTATTC TCCTGTTTAT TTTTOCTGTT ACTTTGAAGC 4620 
ATOGAAAtTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GQQTAAACAA ATGCCCACOG 4680 
GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGGCGTGG 4740 
CACA6TGGGG AATCX»AGGG TCACAGTATG GGGAGAOGTG CACOCTGCCA CCTGCTAACT 4800 
TCTCX3CTAGA CACAGTGTTT CTGCCCAG6T GACCT6TTCA GCA6G^GAAC AAGCCAG66C 4860

CATG6GGA08 06G6AAGTTT TCACTTGGAG ATQQACACCA AGACAATGAA GATTTGTTGT 4920 

CCAAATAG6T CAATAATTCT GGGAG^CTCT TGGAAAAAAC TQAATATATT CAGGACCAAC 4980 

TCTCTCCCTC CCCTCATCXX: ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040 

CACCCAGCTC GCCATGCXTA CTCATTCCTO AATTTCAGGT GCCATCyVCTG CTCTTTCTTT 5100 

CTTCTTTGTC ATTT6A6AAA 6GATGCAGGA GGACAATTCX: CACAGATAAT CTGAGGAATG 5160 

CAGAAAAACC AGGGCAGGAC A6TTATCGAC AATGCATTA6 AACTrGGIGA GCATCCTCTG 5220 

TAGA6GGACT CChCCCCTGC TCAACAGCTT GGCTTCCAGG CAA6ACCAAC CACATCTGGT 5280

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCX3TCA TTGCCATAGC ATCATGATGC 5340 

AACACATCTA CGTGTAGCAC TACGACX3TTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG 5400 

AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTCX36CCTT GCAAGGCCAC 5460 

CTGGAGGCCT 6TCTGTTAGC CAGTG6TGGA GGAGCAAQGC TTCAGGAA08 GOCAGOCACA 5520 

TGCCATCTTC CCTGCGATCA GGCAAAAAAG TOQAATTAAA AAOTCAAACC TTTATATQCA 5580 

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640 

TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CCCTGAGCAA 5700 

TCTTTC6ATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT 5760 

ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820 

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTA6CT 5880 

TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 

TTAATGCCCX: CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000 

TATGATCCXA GAAAACATCT GTCTCTACTT OGGCTGCAAA ACCCATGGTT TAAATCTATA 6060 

TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCXyVATTCTC CATATATTCA 6120 

CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG 6180 

TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA TTTTAGGTAQ 6240 

AAGCTCTGGA ATCCCTTTAT TGT6CTGTTG CTCTTATCTG CAAGGTGGCA A6CAGTTCTT 6300

TTCAGCAGAT TTTGOCCACT ATTCCTCTGA 6CTGAAGTTC TTTGCATAGA TTTGGCTTAA 6360 

GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATQTCA 6420 

GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA 6480 

TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTAT6CA6 GATTTACCTT CATCCT6T6C 6540 

ATGTTTCCCA AACTQTGAGG AGGGAAOQCT CSAOAGATOQA GCTTCTCCTC TGAGTTCTAA 6600 

CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTT 6660 
TGTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 

Seq ID NO: 345 Protein sequence 
Protein Accession #: NP_036204

1 11 21 31 41 51

| | | | | |

HATSMGLLLL LIiLLLTQPGA GTGADTEAW CVGTACYTAK 8GKLSAABAQ NHCNQNGGML 60 

ATVKSXEBAQ HVQRVLAQLL RREAALTAKM SXFHZGLQRB KQXCliOPSLF LKGFSWVQGG 120 

EDTPYSMWHK BLRNSCZSKR CVSIiLLDLSQ PIiLFNRLPKW SEGPCGSPGS PQSNIBGFVC 180

KFSFXGMCRP LAI^GPGQVT YTTPFQTTSS SLEAVPPASA ANVAGGEGDK DETQSHYFLC 240 

KEKAPDVFDH GSSGPLCVSP KYGCNFKKGG CHQDCPEGGD 6SFL0GCRPG FRLLDDLVTC 300 

A5SNPCSSSP CatGGATCVLG PHGKNYTCRC PQGYQLDSSQ IJ)CVDVDEOQ DSPCAQECVN 360 

TPG6FRCBCM VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSPHCS CEE6YVLAGB 420 

DGTQCQDVDE CVGPG6PLCD SLCFNTQGSF ROGOiPGNVIi APIIGV8CTMG PVSLaPPSGP 480 

PDBEDX6EKE GSTVFRAATA SPTRGPE6TP KATPTTSRPS LSSDAPITSA PLKMLAPSGS 540 

SGVWREPSIH HATAASGPQE PAGGDSSVAT QNND6TDGQK LLLFYXIiGTV VAXLLtiLAIA 600 
LGLLVYRKRR AKREBKKHKK PQNAADSYSW VPERAESRAM ENQYSFTPGT DC 

Seq ID NO: 346 DNA sequence 
Nucleic Acid Accession #: Z31560
Coding sequence: <1-966

1 11 21 31 41 51

| | | | | |

CACAGOGCCC GCATGTACAA CATGATGGAG AGGGAGCTGA AGCCGCOGGG CCC6CAGCAA 60 

ACTTGGGGGO GCG6GG6GQG CAACTCCACC GCGG0QSC9SG CCGOOOGCAA CCAGAAAAAC 120 

AGCCOGGACC GCGTCAAGOO GCCCATGAAT GCCTTCATGG TGTGGTCCOG C6GGCAGGGG 180 

CGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTOGG AGATCAGCAA GCGCCTGGGC 240 

GCCGAGTGGA AACTTTTGTC G6AGACGGAG AAGCGGCCGT TCATCGACGA GGCTAAGCGG 300 

CTGCGAGCGC TGCACATGAA G6AGCACCC6 GATTATAAAT ACCGGCCCGG GCGGAAAACC 360 

AAOAGGCTCA TGAAGAAGGA TAAGTACS^ CTGCCGGGOQ 06CT0CTGGC CCCOQGCGGC 420 

AATAGCATGG C6AGGGGGGT CGGGGTGGGC GCC9GGCCTG6 GOGGGGGCGT GAACCAQGGC 480 

ATGGACAGTT ACGCGCACAT GAACGGCTGG AGCAACGGCA GCTACAGCAT GATGCA6GAC 540 

CAGCTGGGCT ACCCGCAGCA CCOGGGCCTC AATGCXSCAOS GCGCAGOSCA GATGCAGCCX: 600 

ATGCACCGCT ACGACGTGAG CX3CCCTGCAG TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAAOGGCT 06CCCACCTA CAGCATOTCC TACT06CAGC AGGGCACCOC TGGCAT6GCT 720 

CTTGGCTCCA TCG6TTCGGT GOTCAAGTCC GAOGCCAGCT CCRGCC3CCCC TGTGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGOACC TCCGGGACAT GATCAGCATG 840 

TATCTCCCOG GOSCCGAGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCA6A GCGGCCC6GT GCCCX3GCA0G GCCATTAACG 6CACACTGCC CXTTCTCACAC 960 

ATGTGAGG6C OGGACAGOGA ACTG6AGGG6 GGAOAAATTT TCAAAGAAAA ACGA6GGAAA 1020 

TGGGAGGGGT GCAAAAGAG6 AGAGTAAGAA ACAGCAIGGA GAAAAGOCGG TAC6CTCAAA 1080 
AAAAA 

Seq ID NO: 347 Protein sequence 
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

HSARMYNMHE TBLKPPGPQQ TSG6GGGNST AAAAGGNQKN SPDRVKRPMN AFKVWSRGQR 60 

RKHAQENPKM HNSEISKRI/3 AEMKLLSBTE KRPFIDEAKR LRALHMKKHP DYKYRPRRKT 120

KTLMKKDKYT LPQGIiLAPGG NSMASGVGVG AGLGAGVNQR MDSYAHMN6W SNGSYSMMQD 180 

QL6YPQHPGL NAHGAAQMQP MHRYDVSALQ YNSMTSSQTY MNGSPTYSMS YSQQGTPGMA 240 

LGSMGSWKS EASSSPPWT SSSHSRAPOQ AGDLROIZSN YLPGAEVPEP AAPSRLHMSQ 300 
HSQSGPVPGT AINGTLPIiSH N 



Seq ID NO: 348 DNA sequence
Nucleotide Accession #: NM_002638
Coding sequence: 120-473

1 11 21 31 41 51

I I I I I I 

CAATACA6CT AAGGAATTAT CCCTTGTAAA TACCACAGAC COGCCCTGGA GCCAGGCCAA 60 

GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAQC CAAACACCTT CCTGACACCA 120 

OXSAGGGCCAG CAGCTTCTTG ATCX3TQGTGG TGTTCCTCAT 0GCT6GGA0Q CTGGTTCTaO 180 

AGGCAGCTGT CACGGOAGTr CCTGTTAAAQ GTCAAGACAC TGTCAAAGGC OGTGTTCCAT 240 

TCAATGGACA AGATCCCOTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300 

OGCAAOAQCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360 

TCCGGTGOGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAQ6AA 420 

TCAAGAAGTG CTOTGAAQGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGOftGC 480 

OGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGQT OCTAAGTOCC 540

TGCTGOXTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCX: ATTCAOGATG OCCAOGOCTG 600
GAGCTGCX:TC TCTCATCCAC TTTCCAATAA A

Seq ID NO: 349 Protein sequence
Protein Accession #: NP_002629

1 11 21 31 41 51 

I I I I I I 

NRASSFLIW VFLXAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VS VKGQD gVK 60 

AQEPVHBPVS TKPGSCPIZL IRCAMZiNPFN RCItKDTDCPG IKKCCEGSCG NACFVPQ 



Seq ID NO: 350 DNA sequence
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468



1 11 21 

I I I 

OAATTCCCSGA CAOQACQTOA AGATAOTTGG 
GT6GACCTGC 08CCAT6CA8 QAOQGTAACT 
GOGTOTGCTC CCTGGCGCTQ CCCTCTGACC 
C6GAG6CCGA GOGGCTGCGG GCA6CCC6CG 
AaCTGGGACA QCA6CCGCG0 CACAAGGGG6 
CCAGAG8CAC ATOAGGGGG CAGIACCACA 
AGGGCCTGAO TGGG6ACAAG ACCTGGGGCT 
CAOCCrCCTG GTCCTCCCGC TCCGCCGTGG 
CCCACAATGG GGGCAGCGCC TTTGGGGCOQ 
CCAT6CCCAC CAG6CCCGT6 TCCTTCCAT6 
ATQACftCACT CT C CCTOCGC T06CTGCGGC 
6CCTG6TQTC TOAGCAGCTG GA0CCC6CG6 
AGCGCCAGGC CAGCTCCAGC TCCAGCCGGG 
AGGTTTCCCC GAGCCGGACC ATCCGTGCCC 
GCAGCCACCQ GAGCGGGGOG GTAGGOGGGG 
CT0QA8GGCC ATCIGTOGGC AGCCTCAOCC 
ACGT6CATGG GT7CAACAGC TAG66TA60C 
TTQATQACAT TGACCTGCCC TCAGCAGTCA 
AGGTGCTGGG A60GGCCTAC ATCCAGCACA 
AGGCCC6CAG CCTTCAGGCC GTGCCTAGGC 
AAGT6CAQCG CCATGCCACA GGT6CCATGC 
A6CTGGCCCT GGTGGAGGAG AAOSGGATCT 
ATGATGA6CT TCGCAAAAAT GTCACAGGGA 
TGAAGGACCG CCTGGCCftGA QACAOGCTGG 
TGTCGGGGGC TGGGGGTCCC CCCCTCATCC 
ACAAOQCCAC CGGCTTCCTC AGGAACCTCA 
TGGGGGAGTG CCACGGGCTG GTGGACGCCC 
C6GGCAAAT6 C6A6QACAAG AGCGTGGAGA 
ACCGCCTCTA OGACQAGATG CCGCOGTCCG 
GGGACCTGGC G6GGGC6C06 CCGGGAGA6G 
GGCTGCGOGA GCTGCCCCTC 6CCGCCX3ATG 
CCAAGGGCCT CGAGTGGCTG TGGAGCCCCC 
AG06CTGCGA GCTCAACCGG CACACGACOG 
CGGCAGGCGA CCGCAGGTGG GOGGGGGTOC 
TTCTCAACCC CCTGCTAGAC CGTGTCAGGA 
CTGGCCTCAT CCGAAACCTG TCTCGGAAOG 
TGGTGAGCCA CCTGATCGAG AAGCTGCCAQ 
AGGTGCTGGT CAACATCATA GCTGTGCTCA 
CCOGAGAGCT 6CTGTA7TTT GAGGOACTCC 
ACAGCCCGQA CAGTGAGAAG TCCTCC0QG6 
AGTACAACAA GCTCCACCGT GACTTTCGGG 
QCCCATAGGT QAAGCCTTCT GGAGQAGAAG 
TCAGCTCCA6 GCTGCTTGGC A6CCCAGCCT 
TCGCTGGGGC CCCXGTGTGC ATCTTTGA06 
ATAGCTGG6G ACTTG6CTTC CGCA6GGCA6 
TGTATGGGGT GGTGACCCAG TCACATTGGC 
ATCTTGGGAT AGCCAGCACT GGGAATAAAO 
AAAAGGAATT C 



31 41 51 

I I I 

GTTTGQAGGC GSCOOCCAGG CCCA6GCCCQ 60 

TCCTGCT6TC 66CCCT6CAQ CCTQAGOOOO 120 

TGCA6CT6GA CCGCCGG66C GCC6AGGG6C 180 

TCCAGGAGCA GGTCOGOGCC OGCCTCTTGC 240 

CCSSCTGAGCC CGA6CCT6A6 6CCQAGACTG 300 

CCCT6CAG6C TGGCTTCAGC TCTGGCTCTC 360 

TCC6GCCCAT OSCCAAGOCQ QCCTACAGCC 420 

ATCTGAQCTG CA6T0GGAGG CTGAGTTCAG 480 

CTGGGTACGG GQGTGCCCAG CCCACX:CCTC 540 

AGC6GQGTGG GGTTGGGAGC CGG6CCGAC7 600 

TGCOGCCCQO GGGCCTGGAC GACCGCTACA 660 

CCACCTCCAC CTACAGGGCC TTTGCGTACG 720 

CAGGGGGGCT GGACTGGCCC GAGGCCACTG 780 

CTGCOGTGCG QACCCTGCAG CGATTCCAGA 840 

CAGTGC0GG6 GGCCGTCCTG GAGCCAGTGG 900 

TCAGCCT6GC TGACTCG6GC CACCTGCCGG 960 

ACOGAACCCT 6CAGAGACTC AGCAGOGGTT 1020 

A6TACCTCAT GGCTTCAQAC CCCAACCTGC 1080 

AGTGCTACAG CGATGCAGCC GCCAAGAAGC 1140 

TGGTGAAGCT CTTCAACCAC GCCAACCAGG 1200 

GCAACCTCAT CTACGACAAC GCTGACAACA 1260 

TCGAGCTGCT G066ACACT6 C6GGAGCAGG 1320 

TCCTGTGGAA CCTTTCATCC AGCGACCACC 1380 

AGCA6CTCAC GGACCTGGTG TTGAGCGOCC 1440 

AGCAGAACGC CTCG6AG6CG GAGATCTTCT 1500 

GCTCAGCCTC TCAGGCCACT CGCCAGAAGA 1560 

TGGTCACCTC TATCAACCAC GCCCTGGACG 1620 

ACGGGGTGTG CGTCCTGG6G AAGCTBTCCT 1680 

CGCTGCA6CG GCTGGAG6GT CGCGQCGGCA 1740 

T06TGGGCT6 CTTCAC6CCG CAGAGCCGGC 1800 

CGCTCACCTT CGCGGAGGTG TCCAAGGAOC 1860 

AGATCGTGGG GCTGTAC3VAC CGGCTGCTGC 1920 

AGG0G6C0GC GGGG6CGCTG CAGAACATCA 1980 

TGAGCCaOCT G6CCCIGQAQ CAGGAGCGTA 2040 

CC6CC6ACCA CCACCA6CTG 06CTCACTGA 2100 

CTAGOAACAA QGACGAGATG TCCAC6AAGG 2160 

GCAGCGTGGG TGAGAAGTCG CCCCCAGCCG 2220 

ACAACCT6GT GGTGGCCAGC CCCATC6CTG 2280 

GAAAGCTCAT CTTCATCAAO AAGAAGOGOa 2340 

CAGCATCCAG CCTCCTG6CC AACCTGTGGC 2400 

CGAAGGGCTA TCGGAAGGAG GACTTCCTGG 2460 

GTGA06TGGC CCAGOQTCCA AGGGACAGAC 2520 

GGAGGAQAAG GCTAATOAGG GAGGGGCCCC 2580 

GTGCTG60CC ACCAOaAOGG GCAG6GTCTT 2640 

GGGGTGGGGC AGGGCTCAAG GCTGCTCTGG 2700 

AGAGGTGGQa GTTGGCTGTG GCCTG6CAGT 2760 

ATG6CCATQA ACAOTCACAA AAAAAAAAAA 2820 



Seq ID NO: 351 Protein sequence
Protein Accession #: NP_009114.1

1 11 21 31 41 51 
I . 11

MQOGNFLtiSA LQPEAGVCSL ALPSDLQLDR 
FRHHGAAEPB PBAETARGTS RGQYHTLQA6 
SRSAVDLSCS RRLSSAHNGG SAFGAAGYGG 
LRSLRLGPGG LDDRYSLVSE QLEPAATSTY 
RTIRAPAVRT LQRFQSSHRS RGV6GAVPGA 
NSYOSHRTLQ RLSS6FDDID IiPSAVICn.NA 
QAVPRLVKLF NKANQEVQRR ATGAMRNLIY 
KNVTGILWNL SSSDHLKDRL ARDTLEQLTD 
FliRNliSSASQ ATRQKMRECH GLVDALVTSI 
BMPPSALQRL £X3RGRRDIiA6 APPGEW6CP 
NLNSFQIVQb YNRIiLQRCBL NSBTTBAAAG 
IiDRVRTADHH QLRSLT6LIR NLSBKABNKD 
IIAVLNNLW A8PIAAR0LL YFDGLRKLZF 
HRDFRAKGYR KEDFLGP 



Seq ID NO: 352 DNA sequence 
Nucleic Acid Accession #: M31469
Coding sequence: 1-651 



ROAESPEAER 
FSSR9Q6LSG 
AQPTPPMPTR 
RAFAYERQAS 
VLEPVARAPS 
SDPNLQVLGA 
DNADNKLALV 
LVLSPIiSGAG 
KHALDAGXCE 
TPQSRRLREI) 
A LQNir aGPR 
EMSTKWSHL 
IXKKRDSPDS 



I 

IiRAARVQBC2V 
DKTSGFRPIA 
PVSFHBRGGV 
SSSSRAGGLD 
VRSLSIjSLAD 

AYIQHKCYSD
EEUGZFELLR
GPPLIQQNAS
DKSVENAVCV
PLAADALTFA
SNAGVLSRLA






EKSSRAASSL 



RARLLQLGQQ 
KPAY8PA8NS 
GSRADYDTLS 
WPEATEVSPS 
SGRLPDVH6F 
AAAKKQARSL 
TIiREQDDBLR 
EAEIFYNATG 
LR27LSYRLYD 
EVSKDPKGLE 
LEQERIINPL 
KSPPAEVLVN 
lANLNQniKIi 



1 
I 

ATGGCT6C6C 
ACTGGAAAAA 
GCCACCTT66 
TTCAATGTAT 
ATCCAAGCCC 
GTGCCTAACT 
GGCAACAAAG 
AAGAAGAATC 
TTCCTCTGGC 
GCTCTCGCCC 
TTAGA66TTG 



11 
1 

AGG6AGAGCC 
OQACCTTCQT 
GTGTTGAGGT 
GGGACACAGC 
AGTGTGCCAT 
GGCATA6AGA 
TOQATATTAA 
TTCAGTACTA 
TTGCTAGGAA 
CACCAGAAGT 
CTCAGACAAC 




41 
I 

TATTGGTTQG 
AATTTGAGAA 
CCAACA6AGG 
GACTGAGAGA 
C6AGAGTTAC 
ACATCCCCAT 
AATCCATTGT 
ACTACAACTT 
TG6AATTTGT 
CAGCACAGTA 
ATGACCTGT6 



51 

1 

T6ATGGTGGT 
GAAGTATGTA 
ACCTATTAAG 
TGGCTATTAT 
TTACAAGAAT 
TGTGTTGTGT 
CTTCCACCGA 
TGAAAAGCCC 
TGCCATGCCT 
TGAGChCGAC 
A 



Seq ID NO: 353 Protein sequence
Protein Accession #: AAA36546

1 11 21 31 41 51 

I I I I I I 

MRAQGEPQVQ PKLVLVGDGG TGKTTPVKRH LTGBPEKKXV ATLGVEVHPL VPHTNRGPIK 
FNVWDTA6QE KFGGLRDGYY IQAQCAIIMF DVTSRVTYKN VFNHHRDLVR "VCBNIPIVLC 
GNKVDIKDRK VKAKSIVFHR KKNLQYYDIS AKSNYNFBKP FLWLARKLZG DPNLEFVAMP 
ALAPPEWMD PALAAQYEHD LEVAQTTALP OEDDOIt 



Seq ID NO: 354 DNA sequence 
Nucleic Acid Accession #: NM_002820 
Coding sequence: 304-831



1 
I 

CGGOTTCGCA 
CCCTGTTCCA 
CGTGTAAACA 
TTCAGAGGAA 
GTTTG6AGAA 
ACGATGCAOC 
QTGCCCTGCT 
6AACATCAGC 
CTTCACCATC 
CCTAACTCCA 
GAGGGCAGAT 
AAGACACCTG 
AAACGGCQAA 
GACCACCTQT 
CTGGCCOGTA 
GCTTGGACAA 
CAGAGAATAA 
TGTCCTCCAG 
CATCAATCCT 
ATCTTCATAA 
TTCTTCAGTG 
GATATTATCT 
ACTTTTTATT 
TAAATTATGT 
CCA6CTCATA 
GGTTTTTCTC 
COGTAOGAAA 



11 
I 

AAGAA6CZQA 
CGAACCCAGG 
CACTACTTAT 
GCGCCTCTGA 
AGCACAGTTG 
GGAGACTGGT 
GCGGGC6CTC 
TCCTCCATQA 
TGATCGCAGA 
AGCCCTCTCC 
ACCTAACTCA 
GGAAGAAAAA 
CTCGCTCTGC 
CTGACACCTC 
GCCTOVGCSGG 
ACCTAGAATT 
CTCAGAATAT 
CACCATAGAG 
TTACCACTCT 
TTTGCTGGAG 
TTTTTCATTT 
ACAAACACT6 
TAATTAAATG 
TTTAAACACA 
CAAAATAAAT 
ATQTATCTTT 
AATAAAACTT 




AGGCGCTAGA 
ACCAAATAAT 
AAGTGTATTT 
CTTACX3TTCT 
CAGAACAGCA 
TATTTAATTA 
TGCCTTAAAT 
GGTTTCTGAA 
TTGTTCATTG 
CACATTTAAA 



31 
I 

GGAAACTTTC 
GCCAGATTAA 
TATATAAAAC 
TTTTCCCTTT 
TT6CTAAATA 
AG0GTOGC6G 
CTCAGCCGCC 
TCCATCCAAG 
GCTQAAATCA 
AACCACCCCG 
AAGGTQGAGA 
CCC6GGAAAC 
TCTGGAGTGA 
CTGGAGCTCG 
GCTGGGTTTT 
ATGTATCTCT 
AAAGCAGTAC 
GCCCATTCCr 
TTCATATTCA 
CTTCCCCTTA 
TTCACTTCAA 
TCATGTCATA 
AATCTCAAAT 
TTGTTTAATT 
AATGTTTAAG 
GCAAGATGIU^ 



41 

I 

TTCTTTTAGG 
TTAGACATT6 
CATTTTATTT 
TTGCTCTTTC 
AQTCCXXSAGC 
TGTTOCTGCT 
GCCTCAAAAG 
ATTTACGGCG 
GA6CTACCTC 
TCCGATTTGG 
GGTAOUUUSA 
GCAAGGAGCA 
CT6G6AGTGG 
ATTCACGGTA 
GGAGCCTCCC 
ATOGATTGTG 
CCCCCTACCA 
CTTTCTCCAC 
AGCTTCAGAA 
CTCTCACACC 
GGGAGAATAT 
AACX5ATTCTG 
TTATTTTAAT 
AAATTTAACT 
TATTAACTTA 
ATAATTTTTC 



51 

I 

AGGCGGTTAG 
CTATGGGAGA 
TOGCTATTAT 
TGGCTGTGTG 
GCQAGCQGAG 



A6CTGTGTCT 
AOGATTCTTC 
GGAGGTGTCC 
GTCT6ATGAT 
GCAGCCGCTC 
GGAAAAQAAA 
GCTAGAAGGG 
ACAGGCTTCT 
TTCTGCCTTG 
TAGCAATT6A 
CACACACCCC 
CGTCACCCAA 
GCTAGTGACC 
TGGGCAAACT 
A6AAGCATTT 
AGCCATTCAC 
GTAAAGAACT 
CTGGTTTCTA 
CAAGGATATA 
TAGGGTAATG 



Seq ID NO: 355 Protein sequence
Protein Accession #: NM_002820

1 11 21 31 41 51

| | | | | |

HQRRLVQQW8 VAVFLI«SYAV PSCX3RSVEGL 8RRLXRAV8E BQLLHDKGKS IQDIARSFFL 
HHLIAEIBTA BIRATSBVSP NSXFSPNTKN BPVRFGSDDE ORYLTQETMR VSTYKEQPIiK 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
1440 
1500 
1560 



60 
120 




TPGKKKKGKP GKRKEQBKKK RRTRSAWLDS GVTGSQLEGD HLSDTSTTSL BLDSR 



Seq ID NO: 356 DNA sequence
Nucleic Acid Accession #: NM_017522
Coding sequence: 1-2100

1 11 21 31 41 51

| | | | | |

ATG6GCCTCC CCGA6CC0GG CCCTCTCCGG CTTCTGGCGC TGCTGCTGCT 6CT6CTGCTG 60 

CTGCTGCTGC TGCGGCTCCA GCATCTTGOS GCGGCAGCGG CTGATCOQCT GCTCGQGG6C 120 

CAAGGGCCG6 CCAAGGAGTG CGAAAAGGAC CAATTCCAGT GCCGGAACOA GC6CIGCATC 180 

OCCTCTGTGT GGAGATGCGA CGAGGACGAT GACTGCTTAG ACCACAGCGA 06AQQACGAC 240 

TGCCCCAAGA AGACCTGTGC AGACAQTGAC TTCACCTGTG ACAACX3GCCA CTGCATCCAC 300 

GAA0G6TGGA AGTGTGAOGG CGAGGAGGAG TGTCCTGATG GCTCCGATGA GTCC6AGGCC 360 

ACTTGCACCA AGCAGQTGTG TCCTGCAGAO AAGCTGA6CT GTGGACCXAC CRGCCACAAG 420 

TGTGTACCTG CCTOGTGGCG CTGCGAOGGG GA6AAGGACT GCGAGGGTGG AGCGGATGAG 480 

GCCGGCTQTG CTACCTCACT QGGCACCTGC OGTGGGGAOG AGTTCCAGTG TGGG6ATGGG 540 

ACATOTGTCC TTGCAATCAA QCACTGCAAC CAGGAGCAGG ACTGTCCAQA TGGGAGTQAT 600 

GAAGCTGGCT GCCTACAGG6 GCTGAA0BA6 TGTCTGCACA ACAATQGOGG CTGCTCACAC 660 

ATCTGCACTG ACCTCAAGAT TGGCTTTGAA TGCftGGTGCC CAGCAOOCTT OCAGCTCCTG 720 

GACCAGAAGA CTTGTGGCGA CATT6AT6AG T6CAAGGACC CAGATGCCTO CAGCCftGATC 780 

TGTGTCAATT ACAA6GGCTA TTTTAAGTGT GAGTCCTACC CTGGCTGOSA GATGGACCTA 840 

CTGACCAAGA ACTGCAAGGC TGCTGCTGGC AAGAGCCCAT CCCTAATCTT CACCAACCGC 900 

ACX5AGTGCGG AGGAT06ACC TGTGAAGCQG AACTATTCAC GCCTCATCCC CATGCTCAAG 960 

AATGTCGTGG CACTAGATGT GGAAGTTGCC ACCAATCX3CA TCTACTGGTO TSACCTCTCC 1020 

TACCGTAAGA TCTATAOCGC CTACATGQAC AAGGCCAGTO ACCCGAAAGA 60GGQAGGTC 1080 

CTCATTQACG AGCAGTTGCA CTCTCCAGAG G6CCTGGCAG TGGACTGGGT CCAC3UU3CAC 1140 

ATCTACTGGA CTGACTCGGG CAATAAQACC ATCTCAGTGG CCACAGTTGA TGGTGGCOSC 1200 

G6A0GCACTC TCTTCAGCC3G TAACCTCAGT GAACOCOGGG CCATOGCTGT TGACCCCCTG 1260 

OQAaQGTTCA TGTATTGGTC TBACTGGGGG GACCAGGCCA AGArrGAOAA ATCTOGGCTC 1320 

AACGGTGTGG ACCGGCAAAC ACTGGTGTCA GACAATATTG AATG6CCCAA OGGAATCACX! 1380 

CTGGATCTGC T6A6CCAGCG CTTGTACTGG GTAGACTCCA AGCTACACCA ACTGTCCAOC 1440 

ATTGACTTCA GTGGAGGCAA CAGAAAGACG CTGATCTCCT CCACTGACTT CXTTGAGCCAC 1500 

CCTTTTG6GA TAGCTGTGTT TGAGGACAAG GTGTTCTG6A CAGACCTGGA GAACX3AGGCC 1560 

ATTTTGAlGrG CAAATOQGCT CAAT06CCTG GAAATCTCCA TCCTG6CTGA GAACCTCAAC 1620 

AACCCACATG ACATT6TCAT CTTCCATGAG CT6AA8CAGC CAA6AGCTCC AGATGCCTGT 1680

GAGCTGAGTC TCCAGCCTAA TOQAGGCTOT GAATACCTGT GCCTTCCTGC TCCTCAGATC 1740 

TCCAGCCACT CTCCCAAGTA CAC3VT6TGCC TGTCCTQACA CAATGTGGCT GGGTCCAGAC 1800 

ATGAAQAGGT GCTACXrCAGA TGCAAATGAA 6ACAGTAAGA TGGGCTCAAC AGTCACT6CC 1860 

GCTOTTATCG GGATCATGGT QCCCATAGTG GTGATAGCCC TCCT GTGC AT GAGTGGATAC 1920 

CrOATCTGGA GAAACTGGAA GC6GAA6AAC ACCAAAA6CA TQAATTTTGA CAACCCA8TC 1980

TACAGGAAAA CAACAGAAGA AGAAGATGAA GATGAGCTCC ATATAGGGAG AACTGCTCAQ 2040 

ATTGGCCATG TCTATCCTGC ACGAGTGGCA TTAAGCCTTQ AAGATGATGG ACTACCCTGA 2100 

G6ATGGQATC ACCCCCTTOO TGCCTCATGG AATTCAGTCC CATGCACTAC ACTC06GATG 2160 

aTGTATOACT GGAT6AA7GG GTTTCIATAT ATGQGTCTQT GTQA0T6TAT GTGTGTGT6T 2220 

GATTTTTTTT TTTAAATTTA T0TT6GGGAA AOGTAACCAC AAAGTTAT6A TGAACT6CAA 2280 

ACATCCAAAG GATGTGAGAG TTTTTCTATG TATAATGTTT TATACACTTT TTAACTGGTT 2340 

GCACTACCCA TGAGGAATTC GTGGAATGGC TACTGCTGAC TAACATGATG CACATAACCA 2400 

AATGGGGGCC AATGGCACAO TACCTTACTC ATCATTTAAA AACTATATTT AC3U3AAGATO 2460 

TTTGOTTCCT G6GGGGCTTT TTTAGGTTTT GGGCATTTGT TTTTTOTAAA TAAQATGATT 2520 
ATGCTTTGT6 6CTATCCATC AACATAAGT 

Seq ID NO: 357 Protein sequence 
Protein Accession #: NP_059992

1 11 21 31 41 51

| | | | | |

MGLPBPGPLR TfT f ftT-T'T-T-T.r.t. LLLLRLQHLA AAAADPLL66 QGPAKBCBKD OFQCRNBRCI 60 

PSVWRCDEDD DCLDHSDEDD CPKKTCADSD FTCDNGHCIH ERHKCDGEEE CPDQSDESEA 120 

TCtKOVCPAE KLSCQPTSHK CVPASWRCDG EKDCEGGADE AGCATSLGTC RGDEFQCGDG 180 

TCVLAIKHCN QBQDCPDGSD BAGCLQGI2IE CLHNNGGCSH ICTDLKIGFE CTCPAGFQLL 240 

DQKTOGDIDE CKDPDACSQI CVNYKGVPKC ECYPGCENSDL LTKNCKAAAG KSPSLIPTNR 300 

TSABDRPVKR NYSRLIPMLK NWALDVEVA TNRIYWCDLS YRKIYSAYMD KASDPKEREV 360 
LIDEQIiHSPE GLAVDWVHKH lYWTDSGNKr ISVATVDGGR RRTLPSRNLS EPRAIAVDPIi* 420 

RGFMYW8DWG DQAKIEKSGL NGVDRQTLVS DNIEWPHGIT LDLLSORLYN VDSKLHQL68 480 

IDPSGGNRKT XiISSTDFLSH PFGIAVFEDK VFWTDLEHEA IPSANRLNGL EISILAENIiK 540 

NPHDIVIFHE LKQPRAPDAC ELSVQPNGGC EYLCLPAPQI SSHSPKXTCk CFDT^WL6PD 600 

MKRCYSDANE DSKMGSTVTA AVIGIIVPIV VIALLCM5GY LXWRNHKRKN TKSMNFDNPV 660 
•YRKTTEEEDE DELHI GRTAQ IGHVYPARVA LSLEDDGLP 

Seq ID NO: 358 DNA sequence
Nucleic Acid Accession #: M27826
Coding sequence: <1-503

1 11 21 31 41 51

| | | | | |

AGCCCAAGAA ACATCTCACC AATTTCAAAT CTGATCTATT CGGCTTAGOG ACTGAAGATT 60 

GAOGCTQOCC QATCGCCTOQ 6AAGTCCCCT 66ACCATCAC AGAAGCCGA6 CTTCGGGTAA 120 

CTCTCACA6T GGAGGGTAAG TCCATCCCCT GTTTAATCGA TAOGGGGGCT ACGCACTCCA 180 

CGTTGCCTTC TTTTCAAGGG CCTGTTTCCC TT6CCCCCAT AACTGTTGTQ GGTATTGAOG 240 

GCCAA6CTTC AAAACCCCTG AAAACTCCCC CACTCTGGTG CCAACTTGGA CAACACTCTT 300 

TTATGCACTC TTTTTTAGTT ATOCCCACCT GCCCACTTCC CTTATTAGGC CGAAATATTT 360 

TAACCAAATT ATCTGCTTCC CT6ACTATTC CTG6AGTACA 6CTACATCTC ATTGCTGCCC 420 

TTCTTCCCflA TCCAAAOCCT CCTTTGTGTC CTCTAACATC CCCACAATAT CAGOCCTTAC 480 

CACAAQACCT CCCTTCAGCT TAATCTCTCC CACTCTAGGT TCCCACGCCG CCCCTAATCC 540 

CACTTGAAGC AQCCCTGAQA AACATOGCCC ATTCTCTCTC CATACCACCC CCCAAAAATT 600 

TTCGC06CTC CAACACTTCA ACACTATTTT GTTTTATTTG TCTTATTAAT ATCAGAAGGC 660 
AGGAATGTCA GGCCTCTGAG CCCAGGCCAG GCCA.TOGCAT CCCCTGTGAC TTGCACX3TAT 720

ACATCCAGAT GGCCTGAAGT AACTGAAGAT CCACAAAAGA AGTAAAAACA GCCTTAACTG 780

ATGACATTCC ACCATTGTGA TTTGTTCCTG CCCOVCCCTA ACTGRTCAAT GTACTTTGTA 840 

ATCTCCCCCA CCCTTAAGAA GGTTCTTTGT AATTCTCCCX: ACCCTTGAGA ATGTACTTTG 900 

TGAGATCCAC CCCTGCCCAC CAGAGAAC3UV CCCCCTTTGA TTGTAATTTT TTATTAC3CTr 960 

CCCAAATCCT ATAAAACAGC CXZCAOCCCTA TCTTCXTTTCA CTQACTCTCT TTTCGGACTC 1020 
AGCCACCGGC ACCCAGGTGA AATAAACAGC TTTATTGCTC AC 

Seq ID NO: 359 Protein sequence
Protein Accession #: AAA65999

1 11 21 31 41 51

| | | | | |

PKKKLTNFKS DLFGLATEOW RCPZASEVFW TITBABLRVT LTVEGKSIPC LIDTGATHST 60 

LPSFQGPVSL APITWGZDO QASKPLKTPP LWCQIiGQHSF MHSFLVIPTG PLPLLGRNIL 120 
TRLSASLTIP 6VQLHLIAAL LPNPKPPLCP LTSPQYQPLP QDLFSA 

Seq ID NO: 360 DNA sequence 
Nucleic Acid Accession #: NM_001854
Coding sequence: 162-5582

1 11 21 31 41 51

| | | | | |

AACCATCAAA TTTAGAAGAA AAAGCCCTTT GA CmTi 'CC CCCTCTCCCT CCCCAA76GC 60 

TCTGTAGCAA ACATCCCTG6 CC3ATACCTTG GAAAGGAGGA AGTTGGTCTG CA6TGGCAAT 120 

TTOGTGGGTT GAGTTCACAG TTGTGAGTGC GGGGCTOGGA GATGGAGCCG TGGTCCTCTA 180 

GGTGGAAAAC GAAACGGTGG CTCTGGGATT TCACCGTAAC AACCCTCGCA TTGACCTTCC 240 

TCTTCCAAGC TA6AGAGGTC AGAG6AGCTG CTCCAGTTGA TGTACTAAAA GCACTAGATT 300 

TTCACAATTC TCCAGAGGGA ATATCAAAAA CAACGGGATT TTGCACAAAC AGAAAGAATT 360 

CTAAAGGCTC AGATACTGCT TACftGAOTTT CAAAOCA^ ACAACTCAGT GCOCCAACAA 420 

AACAGTTATT TCCAGCTGGA ACTTTCCCAG AAGACTTTTC AATACTATTT ACAGTAAAAC 480 

CAAAAAAAGG AATTCAGTCT TTCCTTTTAT CTATATATAA TGAGCATGGT ATTCAGCAAA 540 

TTGGTGTTGA G6TTGGGAGA TCACCTGTTT TTCTGTTTGA AQACCACACT GGAAAACCTG 600 

CCCCAGAAGA CTATCCCCTC TTCAGAACTO TTAACATCOC TGACOGQAAG TGGCATOGGG 660 

TAGCAATCA6 GSTGQAGAAO AAAACTOIGA CAATGATTGT TOATTOTAAQ AAGAAAACCA 720 

G6AAACCACT TOATAGAAQT GAGAGAGCAA TT6TTQATAC CAATG6AATC AGG0TTTTT6 780 

QAACAAQGAT TTTGGATGAA GAAOTTTTTG AGGGGGACAT TCAGCAGTTT TTGATCACAG 840 

GTGATCCCAA GGCAGCATAT QACTACTGTG AGCATTATAG TCCAGACTGT GACTCTTCAG 900 

CACCCAAGGC TGCTCAAGCT CAGGAACCTC AGATAQATGA GTATGCACCA GAGGATATAA 960 

TOGAATATGA CTATGAGTAT GGGGAA6CAG A6TATAAAGA GGCIGAAAGT OTAACAGAGG 1020 

GACCCACTGT AACTGAGGAG ACAATAGCAC AGA0GGA6GC AAACATCGTT GATQATTTTC 1080 

AAGAATACAA CTATGGAACA ATQGAAAGTT ACCAGACAGA A6CTCCTAGG CATGTTTCTO 1140 

GGACAAATGA GCCAAATCCA GTTGAAGAAA TATTTACTGA AGAATATCTA AOGGOAGAGG 1200 

ATTATGATTC CCAGAGGAAA AATTCT6AGG ATACACTATA TGAAAACAAA 6AAATAGACQ 1260 

GCAGGGATTC TGATCTTCTG GTAGATGGAG ATTTAGGCGA ATATGATTTT TATGAATATA 1320 

AAGAATATGA AGATAAACCA ACAAGCCCCC CTAATGAAGA ATTTGGTOCA GGTGTACCA6 1380 

CAGAAACTGA TATTACAGAA ACAAGCATAA ATGGCCATGG TGCATATGGA GAGAAAGGAC 1440 

AGAAAGGAGA ACCAGCAGTG GTTGAGCCTG GTATGCTTGT CGAAGGACCA CCAGGACCAG 1500 

CAGGACCTGC AGGTATTATG GGTCCTCCAG OTCTACAAGG CCCCACTGGA CCCCCTGGTG 1560 

ACCCTGGCGA TAGGGGCCCC CCAGGACGTC CTGGCTTACC AGGQGCTGAT GGTCTACCTG 1620 

GTCCTCCTGG TACTATGTTG ATGTTACCGT TCC6TTATGG TGGTGATGGT TCCAAAGGAC 1680 

CAACCATCTC TGCTCAGGAA GCTCAGGCTC AAGCTATTCT TCA6CAG6CT OGGATTGCTC 1740 

TGAGAGGCCC AOCTGGCGCA ATG6GTCTAA CIGQAAGACC AGGTOCTGTG GGGGOGCCTG 1800 

GTTCATCTG6 GGCCAAAGGT GAGAGTGGTG ATCCAGGTCC TCAGGGCCCT CGAGGGGTCC 1860

AGGGTCCCCC TGGTCCAACG GGAAAACCTG GAAAAAGGGG TCGTCCAGGT GCAGATGGAG 1920 

GAAGAGGAAT GCCAGGAGAA CCTGGGGCAA AGGGA6ATCG AGGGTTTGAT GGACTTCCGG 1980 

GTCTGCCAGG TGACAAAGGT CACAG6GGT6 AACGAGGTCC TCAAG6TCCT CCAGGTCCTC 2040 

CTGGTGATGA TGQAATGAGG GGAGAAGATO GAOAAATTGG ACCAAGAGGT CTTCGAOaTG 2100 

AAGCTGGCCC A0SA6GTTT6 CTG6GTCCAA GG8GAACTCC AGGA6CTCCA GGGCAGCCTG 2160 

GTAT6GCAG6 TGTAGATGGC CCCCCAGGAC CAAAAGGQAA CATGGGTCCC CAAGGGGAGC 2220 

CTGGGCCTCC AGGTCAACAA GGGAATCCAG GACCTCAGGG TCTTCCTGGT CCACAAGGTC 2280 

CAATTGGTCC TCCTQGTGAA AAAOGACCAC AAGGAAAACC AGGACTTGCT GGACTTCCTG 2340 

6TGCTGATG6 GCCTCCTGGT CATCCTGG6A AAGAAGQCCA GTCTG6AGAA AAGGGGGCTC 2400 

TGGGTCCCCC TGOTCCACAA QGTCCTATTG GATNNCCGGO CCOCCGGGGA GTAAAGGGAG 2460 

CAGATGGTGT CAGAGGTCTC AAGGGATCTA AAGGTGAAAA GGGTGAAGAT GGTTTTCCAG 2520 

GATTCAAAGG TGACATGGGT CTAAAAGGTG ACAGAGGAGA AGTTGGTCAA ATTGGCCCAA 2580 

GAGGGNAAGA TGGCCCTGAA 6GACCCAAAG GTCOAGCAGG CCCAACTGGA GACCCAG6TC 2640 

CTTCAGGTCA AGCAGGAGAA AAGGGAAAAC TTGGAGTTCC AQGATTACCA GQATATCCAG 2700 

GAAGACAAGG TCCAAAGGGT TCCACTGGAT TCCCTGGGTT TCCAGGTGCC AATaOAOAGA 2760 

AAGGTGCAGG GGGAGTAGCT QGCAAACCAG GCCCTCGGGG TCAGCGTGGT CCAAOGGGTC 2820 

CTCGAQGTTC AAGAGGTGCA AGAGGTCCCA CTGGQAAACC TGGGCCAAAG GGCACTTCAG 2880 

GTGGCGATGG CCCTCCTGGC CCTCCAGGTG AAAGAGGTCC TCAAGGACCT CAGGGTCCAG 2940 

TTGGATTCCC TGGACCAAAA GGCCCTCCTG GACCACCAGG AAGGATGGGC TGCCCAGGAC 3000 

ACCCTGGGCA AOGTQGGGAG ACTGGATTTC AAGGCAAGAC CGGCCCTCCT GGGCCA6GGG 3060 

GA6TGGTTGG ACCACAGGGA CCAACOGGTG AGACTGGTCC AATAGGGGAA OGTGGGTATC 3120 

CTGGTCCTCC TGGOCCTCCT GGTGASCAAO GTCTTCCTOG TGCTGCAGGA AAAGAAGGTG 3180 

CAAA GGGTGA TCCAGGTCCT CAAGGTATCT CA0G6AAAGA TGGACCAGCA GGATTACGTG 3240 

GTTTCCCAGG GGAAAGAGGT CTTCCTGGAG CTCAGGGTGC ACCTGGACTG AAAGGAGGGG 3300 

AAGGTCCCCA GGGCCCACCA GGTCCAGTTG GCTCACCAGG AGAACGTGGG TCAGCAGGTA 3360 

CAGCTGGCCC AATTGGTTTA OGAGGGCGCC OGGGACCTCA GQGTCCTCCT GGTCCAGCTG 3420 

GA6AGAAAGG TGCTCCTGGA GAAAAAGGTC CCCAAGGQCC TGCAGGQAGA GATG8AGTTC 3480

AAGGTCCT6T TGGTCTCCCA GGGCCAGCTG GTCCTGCCGG CTCCCCTGG6 GAAGAC6(3U3 3540 

ACAAGGGTGA AATTGGTGAG CCGGQACAAA AAGGCAGCAA GGGT6GCAAQ GGAGAAAATG 3600 

6CCCTCCCGG TCCCCCAGGT CTTCAAGGAC CAGTTGGTGC CCCTGGAATT GCTGGAGGTG 3660 

ATGGTGAACC AGGTCCTAGA G6ACA0CAGG GGATGTTTGG GCAAAAAGGT GATGAGGGTG 3720 

CCAGABQCTT CCCTGGACCT CCTGGTCCAA TAGGTCTTCA GGGTCTGCCA GGCCCACCTG 3780 

OTQAAAAAOQ TGAAAATGG6 GATGTTGGTC CATGGGGGCC ACCTGGTCCT CCAGGCCCAA 3840
GAGGCCCTCA AGGTOCCAAT GQAGCTQATG GACX3VCAAGG
CAGTTG6TGG T6TTGGA6AA AAGGGTQAAC CTGGAGAAGC 
GGGAAOCAOG TGTAGGOGGT CCCAAAGOAG AAAGAGGAGA 
CTGGAGCTGC TGQACCTCCA GGTGCCAAGG GGCCGCCAGO 
ACCCGGGTCC TGTTGGTTTT CCTGGAGATC CTGGTCCTCC 
GTCAAGATQO TGTTGGTGST GACAAGGCJTG AAGATGGAGA 
CTGGCCCATC TG6TGAGGCT GGCCCACCAG GTCCTCCTGG 
CTGCAGGTGC AGAGGGAAGA CftAGGTQAAA AAGGTGCTAA 
GTCCTCCTGG AAAAACCGGC CCAOTCGQTC CTCAGGGACC 
AAGGTCTTCG GGGCATCCCT GOTCCTQTQG GAGAACAAGG 
AAGATGGACC ACCTGGTCCT ATGGGACCTC CTGGCTTACC 
GCTCCAAG6G T<3AAAAGGGA CATCCTGGTT TAATTGGCCT 
AAGGG6AAAA AGGTGACOQA GGGCTCCCTG GAACTCAAGG 
ATGGGGQAAT TCCTGGTCCT GCTGGTCCCT TAGGTCCACC 
GTCCTCRAGO CCCAAAGGGT AACAAAGGCT CTACTGQACC 
GTGGTCTTCC AGGGCCTCCT GGGCCTCCAG GTCCACCTGG 
CAATCTTGTC CTCCAAAAAA ACGA6AAGAC ATACTQAAGG 
ATAATATTCT TGATTACTCG GATOGAATOO AAOAAATATT 
AACAAGACAT OGAGCATATG AAATTTCCAA TGOGTACTCA 
GTAAAGACCT GCAACTCAGC CATCCTQACT TCCCAGATGG 
ACCAAG6TTG CTCAGGAGAT TCCTTCAAAG TTTACTGTAA 
CTTGCATTTA TCCAGACAAA AAATCTGAGQ QAGTAAGAAT 
AACCAGQAAG TTGGTTTAGT GAATTTAAGA GGGGAAAACT 
AAGGAAATTC CATCAATAT6 GTQCAAATGA CATTCC TGAA 
GGCAAAATTT CACCTACCAC TGTCATCAOT CAGCA6CCIG 
GTTATGACAA AGCACTTCGC TTCCTGQQAT CAAATGATGA 
ATCCTTTTAT CAAAACACTG TATGATG6TT GTAOGTCCAG 
TCATTGAAAT CAATACACCA AAAATTGATC AAGTACCTAT 
ACTTTG6TQA TCAGAATCAG AAGTTC6GAT TTGAAGTTGG 
AAGATTAAGA CAAAGAACAT ATCAAATGAA CAGAAAATQT 
TTTTGTGCCA CATGCAAGTT TTGAATAAG6 ATGTATGGAA 
TACCATTTAO GAAATACOGA TGCCTTTQTG GGGGCAQAAT 
ATCATAAAGA TATAAGTTGG TGTGGCTAAG ATG6AAACAG 
TTCTCAACrC TCCTTTTCCT ATTTGAATTT CTTTGGTGCT 
AATATATATT CATAAAAAAT ATG6TGCTCA TTCTCATCCA 
TQTGTTTAAT AAATTGTAAT TATTTTGTGT ACAGTTCTAT 
CCAAAACTTG CACGTGTCCC TGAATTCOQC TGACTCTAAT 
GATGGCAATA ATATATGTAT TATGAAAATG AAGTTATGAT 
TTTCTTTGGT TAATGATGAA ATTCCTTTGr tf i V i'G rPT 

Seq ID NO: 361 Protein sequence
Protein Accession #: NP_001845
KBPWSSRHKT

CnHRKNSSSS 
EHGIQQIGVE 
DCXKKTTKPL 
PDCDSSAFRA 
NIVDDPQBVH 



AYGEKGQRGB 
GADGIiFGPPG 
QFVGGPGSSO 
GFDGLPGLPG 
GAPGQP(S4AG 
GLAGLPGADG 



11 

I 

KRWLflDPTVT 
DTAYRVSKQA 
VGRSPVFLPE 
DRSBRAIVDT 
AQAQEPQXDE 
YGTMESYQTE 
DLLVDGDLGE 
PAWEPQ1LV 
TMLMLFFRYG 
AKGBSCDFGP 



GLPGYPGSIQG 
GPKGTSGGDG 
GPPGFGGWG 
GPAGLRGFPG 
GPP6PA6EKG 



VIX3PPGPKGN 
PPGHPGKEGQ 
OMGXiKGDRGE 
PKGSTGFPGF 
PPGPPGBRGP 
PQGPTGBTQP 
ERGLPGAQGA 
APGEKGPQGP 
PPGLQGPVGA 
ENGDVGPWGP 



6NPGPPGEAG 
GELGPAGQDG 
6BAQAEGPPG 
QXiKSDPGSKG 
GPPGIiPGPQG 
KQADADDNIL 
EYWIDPNQGC 
IiSYLDVEQfS 
EMSYIHIKPFX 
PVCFXiG 



VGGDKGEDGD 
KTGPVGPQGP 
EKGHPGLIGL 
PKGNKGSTGP 
DY8DGMEEIF 
SGDSPKWCN 
INMVQMTFLK 
KTLYDGCTSR 



21 

I 

TLALTFLFQA 
QLSAPTKQLF 
DHTGKPAPED 
NGITVFGTRI 
YAPBDIIBYD 
APRHVSGTNB 
YDFYEYKEYE 
EGPPGPAGPA 
GDGSKGPTI8 
Q6FRGVQ6PP 
QGPPGPP6DD 
KGPQGEPGPP 
8GEKGAL6FP 
VGQI6FRGXD 
PGANGEKQAR 
QGPQGPV6FP 
IGERGYPGPP 
PGLKGGSGPQ 
AGRDGVQGPV 
PGIAGGDGBP 
FGFPGPRGPQ 
KGBAGPPGAA 
PGQPGPPGPS 
AOKPGPEGLR 
IGPPGEQGEK 
A6QKGDSGLP 
GSUfSLKQDI 
FTSQQETCIY 
LLTASARQNF 
KGYEKTVIEI 



31 

I 

REVSGAAFVD 

PGGTFPEDFS 
YPI.PRTVNIA 
LDEEVFBGDI 
YEYGEAEYKE 
PNPVEBIFTB 
DKPTSPPNEE 
GIKGPPGXjQG 
AQEAQAQAIL 
6PT6KPGKR6 



GQQCaiPGPQG 
QPQQPI6XPG 
GPSGPKORAG 
GVAGKPGPRG 
GPXGPP6PPG 
GPPGEQGIiPG 
GPPGPVGSPG 
GLPGPAGPAG 
GPR6QQGMF6 
6PMGADGPQG 
GPP6AKGPPG 
6BA6PPGPPG 
GIFGPVGEQG 
GDRGLPGTQG 
GPP6PPGPPG 
EHMKFPKGTQ 
PDKKSEGVRI 
TYHCKQSAAW 
MTFKIDQVPZ 



ACCCCCAGGT 
AOGAAACCCA 
GAAAGGGGAA 
TGATGATGGC 
TGGGGAACTT 
TCCTGGTCAA 
AAAACGAGGT 
GGGGGAAGCA 
TGCAGGAAAG 
TCTCCCTGGA 
TGGTCTCAAA 
GATTGGTCCT 
ATCTCCAGGA 
TGGTCCTCCA 
CGCTGGCCA6 
TGAAGTCATT 
CATGCAAGCA 
TGGTTCCCTC 
QACCAATCCA 
TOAATATTGG 
TTTCACATCT 
TTCATCATGG 
GCTTTCATAC 
ACTTCIGACT 
GTATGATGTG 
6GAGATGT0C 
AAAAGOCTAT 
TGTTGATGTC 
TCCTGTTTGT 
ACCTTGGT6C 
AACAA06CTG 
CACAGACAAA 
GGCTGATTCT 
GTAGAAAACA 
TCCAGGATGT 
ACTGTTATCT 
TTATQAGGAT 
TTCOGATGAC 



41 
I 

VLKAUaFHNS 
ZLFTVKPKKO 
DGKWERVAIS 
QQFLITGDPK 
ABSVTE6PTV 
EYLTGEDYDS 
FQPGVPABTD 
PTGPPGDPGD 
QQARIAIiRGP 
RPGADGGRGM 
PR6LPGEAGP 
LPGPQGPIGP 
PRGVKGADGV 
PTGDPGPSGQ 
QRGPTGPRGS 
RMGCPOIPGQ 
AA6KEGAKQD 
ERGSAGTAGP 



QK6DE6ARGF 
PFGSVGSVG6 
DDGPKGNPGF 
KRGPFGAAGA 
LPGAAGQD6P 
SPGAKGDGGI 
EVZQPLPILS 
TNPARTCKDIj 



YDV3SGSYDX 
VOVHISDF(a) 



TCTOntKSTT 
QGGCCTCCTG 
GCTGGTCCAC 
CCTAAGG6TA 
GGCCCTGCAO 
CCGGGTCCTC 
CCTCCTGGAG 
GGTGCAGAAG 
CCTGGTCCAG 
GCTGCAGOCC 
GGT6ACCCTG 
CCAGGAGAAC 
GCAAAAGGGG 
GGCTTACCAG 
AAAGGIGACA 
CAGCCTTTAC 
GATGCAGATG 
AATTCCCTGA 
GCCCGAACTT 
ATXGATCCTA 
GGTGGTQAGA 
CCAAAGQAGA 
TTAGATGTTG 
GCCTCTGCTC 
TCATCAGGAA 
TATQACAATA 
6AAAAAACTG 
ATGATCAGTG 
TTTCTTGGCT 
GAOSkACCCA 
CATATACAG6 
AGCTTTGAAA 
TGATTCCCAA 
AAAAAAGAAA 
ACTAAAACAG 
6TGTCCATTT 
6CG6AACTCT 
CCTAAfiTCCC 



51 
I 

PEGXSKTTGP 
ZQSFIiLSZYM 
VBKKTVTMZV 
AAYDYCEHYS 
TEETIAQTHA 
QRKKSEDTLY 
XTETSZK6H6 
RGPPGRP6LP 
PGFMGLT6RP 
PGEPGAKGDR 
RGUiGPRGTP 
PGEKGPQGKP 
RGLKGSKSEK 
AGEKOKLGVP 
RGARGPTOKP 
ROETGFQGKT 
PGPQ6XSGKD 
ZGLRGRPGPQ 
ZGEPGQKGSK 
PGPPGPIGLQ 
V6EKGEPGEA 
VGFPGDPGPP 
EGRQGEKQAK 
PGFM6PP6LP 
PGPAOPIiGPP 
SKKTRRETEG 
QLSHPDFPPG 
WF8EPKRGKL 
ALRFXiGSIIDE 
Q13QKFGFEVG 



3900 
3960 
4020 
40B0 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6130 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 



80 
85 



Seq ID NO: 362 DNA sequence
Nucleic Acid Accession #: NM_003107
Coding sequence: 351-1775

1 11 21 31 41 51

| | | | | |

TTCCCCAGCA TTCGAGAAAC TCCTCTCTAC TTTAQCACGG TCTCCAGACT CAOCCGAGAG 60

ACAGCAAACT GCAGOGOGGT GAGAGAGCGA GAGAGAGGGA GAQAGAGACT CTCCAGCCTG 120

GGAACTATAA CTCCTCTGCG AGAGGCGGAG AACTCCTTCC CCAAATCTTT TQGGGACTTT 180

TCTCTCTTTA CCCACCTCCG CCCCTGCGAG GA0TTGAG66 6CCAGTTCGQ CCGCCQOGCG 240

CGTCTTCCOG TTCX360GTGT GCTTGGCCC6 GG6AAC0Q66 ACGGCCCGGC GATCGCGCGG 300 

CC5GC0GC0GC GAGGQTQTQA GOQCGCQTQO OOGCCOQCCQ AGCOSAGGCC ATGGTGCAGC 360 

AAACCAACAA TGCCS3AGAAC ACXSGAAGOQC TGCTGGCCGG CGAGAGCTCG GACTCGGGCG 420 

CCGGCCTCGA GCTGGGAATC GCCTCCTCCC CCACGCCX3GG CTCCACCGCC TCCACGGGCG 480 

GCAAGGCC3GA C6ACC0GA6C TSGTGC3UVGA CCCCQAGTG6 GCACATCAAG CGACCCATQA 540 

A06CCTTCAT GGTGTGQTOQ CAGATCOAGC GGC6CAAGAT CATGGAGCA6 TCGCXTGACA 600 

TGCACAACX3C CGAGATCTCC AAGCGGCTGG GCAAAOGCTG GAAGCTGCTC AAAGACA6GQ 660 

ACAAGATCCC TTTCATTCGA GAGGCGGAGC GGCTG06CCT CAAGCACATG GCTGACTACC 720 

CCGACTACAA GTACOGGCCC AGGAAGAAGG TGAAGTCXX3G CAACGCCAAC TCCAGCTCCT 780 

GGGCC6CCGC CTCCTCCAAG CCGGGGGAGA AGGGAGACAA GGT0GGTG6C AGTG6GGGGG 840 

GOGGOCATQa GGGGGGOQOC GGGGOGGGGA GCAGCAACXX: GGGGGQAGGA G60Q6CGGT0 900 

CQA6TG6066 CGGCGCCAAC TCCAAACC6G C6CA6AAAAA GAGCTG0G6C TCCAAAGTQG 960 

G6GGGGGCGC GGGOGGTGGG GTTAGCAAAC OGCACGCCAA GCTCATCCTG GCAGOCGGCG 1020 

GOGGCGGCGG GAAAGCAGOG GCTGCCGCCG CC6CCTCCTT CGCCGCCGAA CAGGCGGGGG 1080 

CCGCCGCCCT GCTGCCCCTG GGCX3CXX3CXXS CC36ACCACCA CTC!GCTGTAC AAGGCGCGGA 1140 

CTCCCAGCGC CTGGGCCTCC GCCTCCTOGG CASCCTCXSGC CTC0QCA606 CTCGCGGCCC 1200 

CGGGCAAGCA CCTGGOGGAG AAGAAGOIOA AOCX5C3QTCTA CCTOTTCGGC GGCCTGGGCA 1260 

CGTCGTCGTC GCCCGTGQGC GGCGTGGGCG C3QGGAGCCX3A CCCCAGCXSHC CCCCTGGGCC 1320 

TGTACXsAGGA GGAGGGCXSCG GGCTGCTCGC CCGACGCGCC CAGCCTGAGC GGCCGCAGCA 1380 

GOGCCGCXrrC GTCCCCCQCC GCCGGCCXSCT CGCCCGCCGA CCACCGCGGC TACQCCAGCC 1440 

TGCGCGCOGC CTCXSCCOQCC CCGTCCAGCG CGCCCTCGCA CGCGTCCTCC TCX3GCCTC6T 1500 

CCCACTCCTC CTCTTCCrCC TCCTCGGGCT CCTCGTCCTC CX3ACGACGAG TTGGAAGACG 1560 

ACCTGCTCGA CXTTGAACCCC AGCTCAAACT TTGAGAGCAT GTCCCTGGGC AGCTTCAGTT 1620 

OSTCSTCGGC 6CT0GAC0GG 6ACCTGGATT TTAACTTOSA GCCCGGCTCC GGCTOGCACT 1680 

T06AGTTCCC 6GACTACTGC ACGCCCGAGG TGAGOQAGAT GATCTCGGGA GACTGGCTCG 1740 

AGTCCAGCAT CTCCAACCTG GTTTTCACCT ACTGAAGGGC GCGCAQGCAG GGAGAAGGGC 1800 

OGGGGGGGGT AGGA6AGGAG AAAAAAAAAG TQAAAAAAA6 AAAOQAAAAG GACAGAOGAA 1860 

GAGTTTAAA6 A6AAAAGGGA AAAAAGAAA6 AAAAAGTAAG CAGGGCTCGT TCGCCCGCGT 1920 

TCTCGTCGTC GGATCAAGQA G060G60QGC GrTTTTGQAOC GGCGCTCCCA TCCCCXACCT 1980 

TCCCGGGCCG GGGACCCACT CTGCCCAGCC G6AGGGACGC 6GAGGAGGAA GAGGGTAGAC 2040 

AGGGGCGACC TGTGATTGTT GTTATTGATG TTGTTGTTGA TGGCAAAAAA AAAAAGCX3AC 2100 

TTCX5AGTTTG CTCCCXTTTTG CTTGAAQAGA CCCCCICCCC CTTCCAACGA GCTTCCGGAC 2160 

TTQTCTGCAC CCCCA6CAAG AAGGCGAGTT AGTTTTCTAG AGACTTGAAG GAGTCTCCCC 2220 

CTTCCTGCAT CftCCACCTTO GTTTTCTTTT ATTTTGCTTC TTGGTCAAGA AAGGAGGGQA 2280 

GAACCCAGOG CACCOCTCCC CCCCTTTTTT TAAACGOGTG ATGAAGACAO AAGGCTCOGQ 2340 

GGTQAOGAAT TTGGCCGATG GCAGATGTTT TGGGGGAACG CCGGGACTGA QAGACTCCAC 2400 

GCAGGCGAAT TCCCGTTTGG GGCCTTTTTT TCCTCCCTCT TTTCXXJCTTG CCCCCTCTGC 2460 

AGCCGGAGGA GGAGATGTTG AGGGGA6GAG GCCAGCCAGT 6TGAGCGG0G CTAGGAAATG 2520 

ACCCX3A6AAC GCCGTTGGAA G00CAGCA6C GGQAGCTAGG GGCGG6GG08 GA66AGGACA 2580 

CQAACTGGAA 66GGGTTCAC GQTCAAACtG AAATGGATTT GCACXSTTGOQ GAGCTGGOGG 2640 

CGGCGGCTGC TGGGCCTCCG CCTTCTTTTC TACOTGAAAT CAGTQAGGTG AGACTTCCCA 2700 

6ACCCCGGAG GCGTGQAGGA GAGGAGACTG TTTGATGTGG TACAGGGGCA GTCAGTGGAG 2760 
GGCGAGTGGT TTCGGAAAAA AAAAAAGAAA AAAA6GG 



Seq ID NO: 363 Protein Sequence 
Protein Accession #: NP_003098

1 11 21 31 41 51

| | | | | |

NVQQTtlNA^ TBALLAGESS DSGAGLELGI ASSPTPGSTA STG6KAD0PS WCKTPSGKZK 60 
RPMNAFMVHS QIERRRZMBQ 8PDNHNABIS KRLGKRHXLL KDSDKIPFIR EAERLRLKHM 120 
ADYPDYKyRP RKKVXStaiAN SSSSAAASSK PGEXGDKVGG SGGGGHGGGG G6GSSNAGGG 180 
GGGASGGOAH SKPAQXKSCG SKVAGGAGQG VSKPHAKLIIi AGGGGGGKAA AAAAASFAAE 240 
OAGAAALLPL GAAADHHSLY KARTPSASAS ASSAASASAA LAAPGKHLAB KKVKRVYIiFG 300 
GLGTSSSPVG GVGA6ADPSD PLGBYEEEGA GCSPDAPSLS GRSSAASSPA AGRSPADHRG 360 
YASLRAASFA PSSAPSHASS SASSHS5SSS 5SGSSSSDDB FEDDLLDLHP SSNFBSMSLG 420 
SFSSSSAIiDR OIiDFNFEPGS OSHPBFPinrC TPEVSEMI6G DttLBSSISNL VFTY 



Seq ID NO: 364 DNA sequence
Nucleic Acid Accession #: U10860
Coding sequence: 123-2204

1 11 21 31 41 51

| | | | | |

TGCCGGCTGC TCCTCGACCA GGCCTCCTTC TCAACCTCA6 C0C6OGG0GC CGACCCTTCC 60 

GGCACCCTCC CX5CCCCGTCT CGTACTGTC30 COOTCACOQC C30CGGCTCOQ GCCCT0GC3CC 120 

CX5ATGGCTCT GTGCAACGGA GACTCCftAGC TGGAGAATGC TGGAGGAQAC CTTAAGGAT6 180 

GCCACCACCA CTATGAAGGA GCTGTTGTCA TTCTGGATGC TGGTGCTCAG TACGGGAAAG 240 

TCATAOACXXS AAGAGT6A6G GAACTGTTCG TGCAGTCTGA AATTTTCCCC TTGGAAACAC 300 

CAQCATTTGC TATAAAGQAA CAAGGATTCC GTGCTATTAT CATCTCTGQA GGACCTAATT 360 

CTGTGTATGC TGAAGATGCT CCCTGGTTTG ATCCAGCAAT ATTCACTATT GGCAAGCCT6 420 

TTCTTGGAAT TTGCTATGGT ATGCAGATGA TGAATAAGQT ATTTGGAGGT ACTGT6CACA 480 

AAAAAAGTGT CAGAGAAGAT GGAGTTTTCA ACATTAGTGT GQATAATACA TGTTCATTAT 540 

TCAGGGGCCT TCAGAAGGAA GAAGTTGTTT TGCTTACACA TGGAGATAGT GTAGACAAAG 600 

TAGCTGATGG ATTCAAGGTT 6TGGCACGTT CTGOAAACRT AGTAGCAGGC ATAGCAAATG 660 

AATCTAAAAA GTTATATGGA GCACAGTTCC ACCCTGAAQT T6GCCTTACA GAAAATGGAA 720 

AAGTAATACT 6AAGAATTTC CTTTATGATA TAGCTGGATG CAGTGGAACC TTCACCGTGC 780 

AGAACAGAGA ACTTGAGTGT ATTC6AGAGA TOUUMSAGAG AGTAGGCACG TCAAAAGTTT 840 

TGGTTTTACT CAGTG6TGQA GTAOACTCAA CAOTTTGTAC AQCTTTGCTA AATCGTOCTT 900 

TGAACCAAGA ACAAGTCATT GCTGTGCACA TTGATAATGG CTTTATGAGA AAAGGAGAAA 960 

GCCAGT CTGT TGAAGAGGCC CTCAAAAAGC TTGGAATTCA GGTCAAAGTG ATAAATGCTG 1020 

CTCATTCTTT CTACAATGGA ACAACAACCC TACCAATATC AGATGAAGAT AGAACCCCAC 1080 

QGAAAAGAAT TAGCAAAACG TTAAATATGA CCACAA6TCC TGAAGAGAAA AGAAAAATCA 1140 

TTGG GGATAC T TTTG TTAAG ATTGGCAATG AAGTAATTGG AGAAATGAAC TTGAAACCAG 1200 

AGGAGGTTTT CCTTGCCCAA GGTACTTTAC GQCCTGATCT AATTGAAAGT 6CATCCCTTG 1260 
TTGCAAGTGO CAAAGCTGAA CTCATCAAAA CCCATCACAA TQACHCAGAG CTCATCAGAA 1320

AGTTQAQAGR. GGA6GQAAAA GTAATAGAAC CTCTGAAAGA TTTTCATAAA GATGAAGTGA 1380 

GAATTTTGGG CAGAGAACTT GGACTTCCAG AAGAGTTAGT TTCCftGQCAT CCATTTCCAG 1440 

GTCCTGGCCT GGCAATCAGA GTAATATGTQ CTOAAGAACC TTATATTTGT AAGGACTTTC 1500 

CTGAAACXSWi CAATATTTTG AAAATAGTAG CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 1560 

ATACCCTATT ACAQAGAGTC AAA6CCTGCA CAACAGAAGA GGATCAGGAG AAGCTGATGC 1620 

AAATTACCAO TCTGCATTCA CTOAATGCCT TCTTGCTGCC AATTAAAACT GTAGOTGTGC 1680 

AGGGTGACTG TGOTTCCTAC AGTTACSror GTGGAATCTC CAQTAAAGAT GAACCTQACT 1740 

GGQAATCACT TATTTTTCTG GCTAGGCTTA TACCT06CAT GTGTCACAAC GTTA ACAG AG 1800 

TTGTTTATAT ATTTGGCCCA CCAGTTAAAG AACCTCCTAC AGATGTTACT CCCACTTTCT 1860 

TCACAACAGG GGTGCTCAGT ACTTTAOGCC AAGCTGATTT TGAGGCCCAT AACATTCTCA 1920 

GGGAOTCIGS GTATGCTGGO AAAATCAGOC AOATGOOGGT GATTTTGACA CCATTACATT 1980 

TTGATOGGGA COCACTTCAA AAGCAQCCTT CATGCCAGA6 ATCTGTGGTT ATTCQAACCT 2040 

ITATTACrAG TGACTTCATG ACTGOTATAC CTGCAACACC TGGCAATGAG ATCCCTGTAQ 2100 

AGGTGGTATT AAAGATGGTC ACTGAGATTA AGAAGATTCC TGGTATTTCT OSAATTATOT 2160 
ATGACTTAAC ATCAAAGCCX: CX3VGGAACTA CTGAGTGGGA GTAATAAACT TC 

Seq ID NO: 365 Protein sequence
Protein Accession #: AAA60331

1 11 21 31 41 51

| | | | | |

MALOIGDSKL ENAGGDLKDG HHHYEQAWI LDAGAQYGKV IDRRVRELPV QSEIPPLETP 60 

AFAIKEQGFR AHISGGPNS VYAEDAPWFD PAIFTXGKPV LGICYGMQMM NKVFGGTVHK 120 

KSVREDGVFN ISVDNTCSLP RGLQKEEWL LTHGDSVDKV AOGFKWASS QHIVAOXAMB 180

SKKLYGAQPH PEVGLTENGK VILKNFLYDI AGCSGTPTVQ NRBLECIREZ KERVOTSKVL 240 

VLLSGOVDST VCTALIjNRAL NQECJVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300 

HSFYNGTTTL PISDEDRTPR KRISKTLMMT TSPBBKRKII QDTFVKIANE VIGEMNLKPB 360 

BVFLAQGTLR PDLIESASIiV ASGKAELIKT HHMDTBLIRK LREEGKVZEP LKDFH KDEVR 420 

IL6RELGLFB ELVSRHPPPG PGIAIRVICA BBPYICXDPP ETNNILiaVA DFSASVKIgH 480 

TLLQRVKACT TEEDQEKLKQ ITSLHSLKAF LLPZKTVGVQ GDCRSYSYVC GISSKDEPDH 540 

BSLZPIiARLI PRMCHNVNRV VYIFOPPVKE PPTDVTPTFL TTGVLSTLRQ ADPEAHmiiR 600 

ESGYAGKISQ MPVILTPLHF DRDPLQKQPS CQRSWIRTP ZTSDFMTQIP ATPGNBIFVE 660 
WLIQifVTEZK KIFGISRIMY DLTSKPPGTT EWE 



Seq ID NO: 366 DNA sequence
Nucleic Acid Accession #: NM_004219
Coding sequence: 46-654

1 11 21 

I I I 

GOGGCCTCAG ATGAATGCGG CTGTTAAGAC 
TATGTTGATA AGGAAAATGG AGAACCAGGC 
CiaGaOTCTG GACCTTCAAT CAAA6CCTTA 
TTTGGCAAAA 06TTC6ATGC CCCACCAGCC 
ACTGTCAACA GAGCTACAGA AAAGTCTGTA 
CCAAGCTTTT CTGCCAAAAA GATGACTGAQ 
6CCTCAGATG ATGCCTATCC AOAAATASAA 
GA6ASTTTTG ACCTGCCTGA AOAGCACCAG 
CTCATGATCC TT6ACGAG6A GAGAGA6CTT 
CCTGTGAAGA TGCCCTCTCC ACCATGGGAA 
CTGTCGACCC TGGATGTTGA ATTGCCACCT 
TAGTGCrrCA GAGTTTGTGT GTATTTGTAT 
AAAAAAAA 

Seq ID NO: 367 Protein sequence
Protein Accession #: NP_004210

1 11 21 31 41 51

| | | | | |

MATLIYVDKB NGEP6TRWA KDGLKLGSGP 8IKALDGRSQ VSTPRFGKTF DAPPALPKAT 60 
RKALGTVNRA TBKSVKTKGP LKQKQPSFSA KKMTEKTVKA KESVPASDDA YPEIEKFFPP 120 
NPU)FESFDL FEEHQIABLP LSGVPLMILD EERELEKLFQ L6PPSPVXMP SPFHESmjLQ 180 
SP8SILSTLD VBLPPVCCDI DI 



Seq ID NO: 368 DNA sequence
Nucleic Acid Accession #: NM_000597
Coding sequence: 118-1104

I 11 

1 I 
ATTCGGGGCG AGGGAGQAGQ 
CCTGCCCGCC 0GCCO6CTCG 
CTGCCGAGAG TQ6GCTGCCC 
COGCTGCTGC TGCTGCTACT 
CTGTTCCGCT GCCCGCCCTG 
GCGCCGCCCG CCGCGGTGGC 
GTCCX3GGAGC CGGGCTGCGG 
GGCGTCTACA CXXXXSOGCTG 
CTGCCCCTGC AG6CGCTGGT 
TATGGCGCCA GGCOQQAOCA 
GTGGAGAACC AOGTGGACAQ 
AA6CCCCTCA AGTOGGGTAT 
CACCGGCAGA TGGGCAAGGG 
C6ACCACCCC CTGCCAGGAC 



31 41 51 

! i 1 

CTGCAATAAT CCAGAATGGC TACTCTGATC 60 

ACCCX3T0TGG TTGCTAAGGA TGGGCTGAAG 120 

GATGGQASAT CTCAAGTTTC AACAGCA03T 180 

TTAGCTAAA6 CTACTAGAAA GGCTTTGGGA 240 

AAGACCAAGG GACCCCTCAA ACAAAAACAG 300 

AAGACTGTTA AAGCAAAAAG CrCTGT TCCT 360 

AAATTCTTTC CCTTCAATCC TCTAGACTTT 420 

ATTGG6CA0C TCCGCTTGAO TQQAaxaCCT 480

GAAAAGCTGT TTCA6CTGG6 OCOGCCTTCA 540 

TCCAATCTGT TGCA6TCTCC TTCAAGCATT 600 

GTTTGCTGTG ACATAGATAT TTAAATTTCT 660 

TAATAAAGCA TTCTTCAACA GAAAAAAAAA 720 



21 31 41 51 

1 I I I 

AAGAAGOGGA GGAGGCGGCT CCCGCTCGCA GGGCCGTGCA 60 

CTOSCTCGCC CGCCGOSCCG CGCTGCCGAC CGCCAGCATG 120 

OQCGCTGCCG CTGCCGCCGC CGCCGCTQCT GCCGCTGCTG 180 

GGGC6CGA6T GGCGGC6GC6 GCQOGGCGCQ 0Q0GGAG6TG 240 

CACAOCGGAG OOOCTGOCGO CCTG06GG0C CCCX3C066TT 300 

OGCAGTGGCC GGAGGCGCCC GCAT6CCATG G6GGGA6CTC 360 

CTGCT6CTCG GTGTGCGCCC GGCTGGAGGG CGAQQOOTGC 420 

CGGCCAGQGG CTGCGCTGCT ATOXCACCC GGGCTCCGAG 480 

CATGGGC9GAG GGCACTTGTG AGAAG06CC6 GGAGGCOQAG 540 

68TTGCA6AC AATG60GATG AOCACTCAQft A6GAGGCCT6 600 

CACCATGAAC ATGTTGG60G GGGGAGGCAO TGCTGQOCQG 660 

GAA6GAGCTG GCOGTGTTCC GGGAGAAGGT CACTGAGCAG 720 

TGGCAAGCAT CACCTTGGCC TGGAGGAGCC CAAGAAGCTG 780 

TCCCTGCCAA CAGGAACTGG ACCAGGTCCT GGAGCGGATC 840 
ACCTCTACIC
GCAAGATGTTC 
AGCTGATCCA 
AGCAGCAG6A 
GCCTGGCGCC 
GTGGTGGGTG 
A6CACX!GAGC 
CCTTCTTGCT 
GG6GAG6GG6 
AOGAAGOAAA 



CCTGCACATC 
TCTGAACGGG 
6GC3AOCCOCC 
GGCTTGCGGG 

ccTGccxrcc 

CTGQAGGATT 
TG66CACCTC 
TTCCCCGGGG 
AAGftSAAATT 
AGT 
TCCACCATGC GCCTTCCGGA TQAGCGOGCSC CCTCTGGAGC
CCCAACTGT6 ACAAGCATGQ CCTQTACAAC CTCAAACAGT 
CAGCGTGGGG AOTGCTGQTG TGTQAACCCC AACACXX3GGA 
ACCATCCQGG GGQACCCCGA QTGTCATCTC TTCTACAATG 
OT6CACRCCC AGCGOATGCA GTAGACOGCA GCCAGCC6GT 
GCCCCTCTCC AAACACCGGC AGAAAACGGA GAGTGCTTGG 
TTCCAGTTCT OACACACGTA TTTATATTTG GAAAGAGACC 
CCCGGCCTCT CTCTTCCCAG CTGCAGATGC CACACCTGCT 
GAGGAAGGGG GTTGTG6TCG G6GAGCT6GG GTACAG6TTT 
TTTATTTTTG AACOCCTGWG TCCCTTTTOC ATAAQATTAA 

Seq ID NO: 369 Protein sequence
Protein Accession #: NP_000588

1 11 21 31 41 51

| | | | | |

MLPRVGCPAL PLPPPPLLPL LPIiLLLUiGA SGGGGGARAE VLPRCPPCTP BRLAAC6PPP 
VAPPAAVAAV AGGARMPCAB LVRBPGCGCC SVCRRLEGEA CXSVYTPRCGQ GLRCYPHPGS 
ELPLQAIiVHG EQTCEKRROA BYGASPBQVA DNGDDHSEGG LVENHVDSTM NMLGGGGSAG 
RKPIiKSGHKB lAVFREKVTB QHRQMGKGOK HBLQliBBPKK UEtPPPARTPC QQEIiDg|VI£a 
ISTMRLPDER GPLBHLYSLH IPHCDKHGLY HLKQCKMSUI GQRSBCWCVN PNTGKLIQGA 
PTIRGDPECH LPVNEQQEAC GVHTQRMQ 

Seq ID NO: 370 DNA sequence
Nucleic Acid Accession #: NM_004264
Coding sequence: 6-440



-GGAACATGGC 
TTTGTAATGC 
AGACAGCAAT 
CAGCACTGAT 
AAGAATCIAC 
AAGCTGCTAC 
AAAGCGCACT 
AGTCTCTTCC 
fStQCQKPlML 
TTAAACACTA 
GATAAGCTTA 
6AGTGAAATT 
AATTCTGTTA 
C 



11 
I 

GGATCGGCTC 
CATTGGAGTA 
TAACAAAGAC 
TGCACGAACA 
ASCTGCTTTA 
AT6TQTGGAG 
TGCTGATATT 
AGACTCATA6 
CAAtTCTGCA 
TGACACATTA 
TAAAXCATGA 
ATTAAGGCAT 
TGACATAATT 



21 

! 

AGGCAGCrTC 
TTGCAGCAAT 
CAGCCAGCTA 
GCAAAAGACA 
CAGGCTGCTA 
GATGTTGTTT 
GCACAGTCAC 
CATCA6TGGA 
TCAGACTTAG 
CCTTTTTAGC 
TTGAATCAGC 
GTAATACATT 
TATGTCTCCA 



31 
I 

AGQACGCTGT 
GTGGTCCTCC 
ACCCTACAGA 
TTGATGTTTT 
GCTTGTATAA 
ATC6AGGAGA 
AGCTGAAGAC 
TACCATGTGG 
ATACAAGCCT 
TATTTTTAAT 
TTTAAAGCAT 
AATGAACATA 
TTTTGTTGTA 



41 
I 

GAATTCX3CTT 
TGCCTCTTTC 
AGAGTATGCC 
GATAGATTCC 
GCTAGAAGAA 
CATGCTTCTG 
AAGAAGTGGT 
CTGAGAAAAG 
TACCAACAAT 
AGTCTTCTAT 
CATACCATCA 
ATATAAGGAA 
TTGGCCAGTA 



51 
I 

GCAGATCAiST 
AAXAATATTC 
CAGCTTTTTG 
TTACCCAGTG 
GAAAACCATG 
GAGAAGATAC 
ACCCATAGOC 
AACTGTTTGA 
TACAGAAACA 
TTTCACTCTT 
TTTTTTAACT 
ACATATGTAA 
CTTTTACAAT 



Seq ID NO: 371 Protein sequence
Protein Accession #: NP_004255



41 



51 



1 11 21 31 

111)1) 
MADRLTQLQD AVNSLADQFC NAIGVLQQC3G PPASFNNIQT AINRDQPANP TEEYAQI.FAA 
LIARTAKDID VLIDSLPSBE STAALQAASIt YKLEEBNHEA ATCVEDWYS GDNLIiEKIQS 
ALADXAQSQL KTRSGTHSQS LPDS 

Seq ID NO: 372 DNA sequence
Nucleic Acid Accession #: AJ271091
Coding sequence: 1-1113 



AT6GAGAATC 
CTGGGCGT6G 
CATTTCAAAG 
TTCTTAGACC 
ACAGTACAGA 
CTGTTTTTGG 
AOAGCTAAGO 
ACTCTTACAA 
TTCTCCTGGA 
TATGACACAT 
GAAACTATCA 
CTTCTTGGAA 
AAAGCTGTGG 
TTCTACATGC 
CTGTGGATTC 
ATTCCAATAT 
AAAGTTAGAT 
ATAAATTTTC 
CATGCCTGTG 



11 

) 

AGGTGTTGAC 
AGCT6AGTGA 
CTCAAGGACA 
TTGTC5AAACC 
AGAAAGTGAG 
CTCCTQACTT 
AAGAASAaCG 
ACTTAAGGAA 
TCTTTGTCAA 
TCCATACTGT 
AT6CAGCAAT 
GAAATTTTAT 
TTTTCTTTGT 
TGACGTGCAT 
CCTTATATCC 
TCAATGAGAC 
TTTCCTTTTT 
GTCAOCTTTA 
ATCCCAGOGC 



21 

I 

GCCGCATGTC 
CGTACAGAAC 
TGGTGCXAAA 
AGAGCCTGTT 
TCAGTGGTGG 
TGATCGTTGG 
CCTAAATAAA 
AGGATACCTG 
CCTGACTGTG 
G6CTGACATG 
TGGAG7CACT 
TTTGTTTATC 
GTTTTATTTG 
TGACATGGAT 
ACTGGGATGT 
CGGAOGATTC 
TCTTCAGATT 
TAAACAGCGC 
TTTGGGAGGC 



31 
) 

TACTGGGCTC 
CCTGCCATCA 
GGAGACAAT6 
TACAAACTQA 
GAGAGACTCA 
CTGGATGAAT 
CTCOGACTGG 
TTTATGTATA 
CGATTCTGTA 
ATGTATTTCT 
AC6TCACCGG 
ATCTTTG6CA 
TGGAGT6CAA 
TGGAAGGTGC 
TTGGOGGAAG 
A8TTTCACAT 
TATCTTATAA 
AGACTGAAAA 
TGA 



41 

I 

AQCGACAC06 
GCATCACTQA 
TCTATGAATT 
CCCAGAGGCA 
CAAA6CAGGA 
CTGATGCGGA 
AAAGCGAAGG 
ATCTTGTGCA 
TCTT6GGAAA 
GCCAGATGCT 
TGCTGCCTTC 
CCATGSAAOA 
TTQAAATTTT 
TCACATGGCT 
CTGTCTCAGT 
TGCCATATCC 
TGATATTTTT 
T6AGGGCA0G 



51 

) 

OGAGCTATAT 
AAACGTGCTG 
TCACCTGGAG 
GGTAAACATT 
AAAGCGACCA 
AATGGAGCTC 
CTCTCCTGAA 
ATTCTTGGGA 
AGAGTCCTTT 
GGCAGTTGTG 
TCTGATCCAG 
AATGCAGAAC 
CAGGTACTCT 
TCGTTACACT 
GATTCAGTCC 
A GTGAAA ATC 
AG8TTTATAC 
CGCAGTGGCT 



Seq ID NO: 373 Protein sequence 
Protein Accession #: CAB69070



11 



21 



31 



41 

I 



51 



900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
60
120 
160 
240 
300 



60 
120 
IBO 
240 
300 
360 
420 
460 
540 
600 
660 
720 
780 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 



MENQVLTPKV YWAQRHREIiY liRVEIiSDVQir PAISITENVL HPKAQGHGAK GDNVYEPHLE 60
PL DliVKP EPV YKLTQRQVNZ TVQKKVSgWH ERI.TKQEKRP LPIjAPDFDRH LDESDAEMEL 120
RAKEEEEUHK LRLESBQSPS TVIKhRXGYL FMYNLVOFLG FSWIFVNLTV RFCIL6KBSF 180
TOTFHTVADM MYFCQMLAW ETINAAIGVT TSPVLPSLIQ LliGRNPlLFI IPQTMEEKQN 240
KAWFFVPyL HSAIBIFRYS FYMLTCIDND fnCVLTWUtYT IMIPLYPLGC lAEAVSVZQS 300 
IPIFMETGRF SFTLPyPVKI KVRFSFFLQI YLIMIFIiGLy XNFBBLYKQR RLKKRAGUVA 360 
BACDPSALOO 



Seq ID NO: 374 DNA sequence
Nucleic Acid Accession #: NM_016395
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGOAGAATC AGOTOTTGAC GCCGCATGTC TACTGGGCTC AOCGACACCXS CGAGCTATAT 60 

CTGOGCGTGO AGCTGAGTGA OGTACAGAAC CCTGCCATCA GCATCACTGA AAACX3T6CTG 120 

CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAQACAATG TCTATGAATT TCACCTGGAO 180 

TTCTTAOACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT 240 

ACACSTACAOA AGAAAGTGAG TCAGTGGTGO GAGA6ACTGA CAAAGCAGGA AAAGCGACCA 300 

CTCTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360 

AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CT CTCCT GRA 420 

ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGC3V ATTCTT SGQA 480 

TTCTCCTGGA TCTTTGTCAA CCTGACTGTG C3GATTCTGTA TCTTGOGAAA AOAGTCCTTT 540 

TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600 

GAAACTATCA ATGCAGCAAT TGGAGTCACT AOGTCACCGG TGCTGCCTTC TCTGATCCAG 660 

CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATG GAAGA AATGCAGAAC 720 

AAAGCTGTGQ TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780 

TTCIACATGC TGAOGTGCAT TGACATGGAT TGGAAGGTGC TCACATGQCT TCGTTACACT 840 

CT6TG6ATTC CCTTATATCC ACTGGGATGT TTGGCGGAA6 CTGTCTCAGT GATTCAGTCC 900 

ATTCCAATAT TCAATGAGAC CGGAOGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960

AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020 

ATAAATTTTC GTCACCTTTA TAAA CAGOGC AGACXtSAAAA TGftGOGCRflQ OGCAfiTCGCT 1080 
CATGCCTGTG ATCOCROOQC TTTGQGAGGC TGA 



Seq ID NO: 375 Protein sequence
Protein Accession #: NP_057479

1 11 21 31 41 51 

I I I t I I 

MBNQVLTPHV YUAQRHRELY LRVELSDVQN PAISITQm« HFKAQGKGAK GDMVYEFHLE 60 

FLDLVKPBPV YKLTQRQVNI TVQKKVSQKW ERLTKQEKRP LFIAPDFDRH LDBSDAENEIt 120 

RAKBEERIiNK IiRIiESEGSPE TLINLRK6YL FMYNLVQFLG FSWIFVNLTV RFCIUSKESF 180 

YDTPHTVADM MYFCQMLAW ETIKAAIGVT TSPVLPSLIQ LL6RMPILPI IFGTMBEMQN 240 

KAWFFVFYL WSAIBIPRYS PYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LVEAVSVIQS 300 

IPIPNETQRP SFTLPYPVKI KVRFSPPLQl YLIMIFLGLY INFBHLYHQR RRRYGKKRKR 360 
STKXKDLDGF LPV 

Seq ID NO: 376 DNA sequence 
Nucleic Acid Accession #: NM_005987
Coding sequence: 1-270

1 11 21

I I I 

ATGAATTCTC AGCAGCAGAA GCAGCCTTGC 
6T6AAACAAC CTTGCCAGCC TCCACCCCAG 
TQGCAACCCA AGGT6CCTGA 60CCTGCCAC 
ATTCCASAOC CCTGCCAGCC CAAGGTGCCT 
CCAGCCCA6C AGAAGACCAA GCASAAGTAA 

Seq ID NO: 377 Protein sequence 
Protein Accession #: NP_005978

1 11 21 31 41 51 

I I I I I I 

MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP OQPKVPBPCH PKVPBPOQPK 
IPEPCQPKVP EPCPSTVTPA PAQQKTKQK 



Seq ID NO: 378 DNA sequence 
Nucleic Acid Accession #: NM_002105



31 41 51 

I I I 

CTAOTGrTTG AGCCGT06T6 CTTCACOGOT 60 

GACTOGCGGC AAGGCCXX5CG CCAAOQCCAA 120 

CCCAGTGGGC CGTGTACACC GGCTGCTGC33 180 

OQGCGOGCCA GTGTACCTG6 CGGCA6TGCT 240 

GGCGGGCAAT GGGGCCGQCXS ACAACAAGAA 300 

GGCCATC06C AAOQAOQAQG AGCTCAACAA 360 

AGGCGTCCTG CXICAACATCC AGGCCX5TGCT 420 

GCCGAAGGCX3 CCCTCGGGC3G GCAAGAAGGC 480

CCCGCXSCCX3C GGCCGGCCGC CCCAGCTCCC 540 

ACCACOGCrC TCATGGAAAG AGCTGA6CGG 600 

TCCCTTOCCC TCCOCTCCCC TC6CCCGCCT 660 

CCOGCTCCCG TCCOGCACaS CXnX3CCGC!GT 720 

CCCTCCGGTA GGGTTCGGGC CTTCCGGATO 780 

6CX3CGGAAGA CCCXSAGCCT6 COGGGGGGAG 840 



31 41 51 

I I I 

ACCXCACCCC CTCAGCCTCA GCAGCAGCAG 60 

GAACCATGCA TCCCCAAAAC CAAGGAGCCC 120 

CCCAAAGTGC CIGA6GCCTG CCAGCCCAAO 180 

GA6CCCIG0C CTTCAAaSGT CACTCCAGCA 240 



Coding sequence: 74-505

1 11 21 

I I 1 

ACAGCAQTTA CACTGCGGCG GGCGTCTGTT 
CTACCTCGCT AGCATGTOGG GCCGCGGCAA 
GTCGC6CTCG TCGCGCGCCG GCCTCCAGTT 
GAAGGGCCAC TAOGCOGAGC GCGTTGGCGC 
GQAGTACCTC ACasCTGAQA TCCTGGAflCT 
GACGCGAATC ATCCCCCGCC ACCTGCAGCT 
GCTGCTGGGC GOCGTGACQA TCGCCCAGGG 
GCTGCCCAAQ AAGACCAGCG CCACCGTGGG 
CACCCA6GCC TCCCAGGAGT ACTAAGAGGG 
CATQCCACCA CAAAGGCCCT TTTAAGGGCC 
CTTCAQACTQ 06GG6CAA8C GGGCOGOGGC 
TCGCOGCCCG GCCTCQAGTC CCCGCCCGCC 
OGGCCTCGGG CCTGCCCTGT CCGCCGTCCO 
0G6CTTG0GC GCTCTTCGGG GACCTCOGTG



GCCGGGGGCG 
GCTAA6G06C 
CAGGGCOGAG 
CGGCAGCTGC 
AGACGGCCGC 
GCCCCTTCTG 
AQGTCIGOGC 
CT6CCTCCTA 
CTATGTGGAC 
CCGACX3CC6C 
C/IGCACAAGT 
CTGCAGCTAA 
TTTATTAAAG 



CCX3CACCTGC 
TGGG66QAGG 
GTGGGC3U3TC 
AGCCGGQGTG 
TGGCCGG6AG 
CGGCCGGGAC 
TGGGGCOGGG 
G6AG6ACATT 
AGCAAOAGTC 
CCCATTTCCC 
OGGTTAATCC 
CXXITTOCAGG 
QATTGTTTTT 



CCGCCT06GC 
CG6CA6CACC 
CA6GCGGA6A 
TCTGGTACCC 
GCTTTGGTGG 
CGAGGCCTTT 
AOSAAQCACT 
TAGGGGAGGG 
GTTTTGCGGA 
TTCCAGCAAA 
CTGTCTGGAC 
ACTAGAACCT 
TTTTT 



GTTCX3T6ACr 
TTCTGGAAGA 
GC0GG08GGC 
CCCXGGCGTG 
GAGAGACGCG 
CACATCAGCT 
TGOTAAOUaG 
CAGAGGCCTG 
ACGC6ACTGG 
CTCAACTCGG 
TGA6CCTCCG 
TAGGCATTGG 



CAGCCX3CCCC 
CTTGGCCTTC 
CTGAAGGTGA 
GTGCTTAGCC 
ATCGCCGATT 
CTCCCTCCAT 
CACATCTTCC 
CAGTTTGGCT 
CAGCCAGGCC 
CAATCCAAQC 
TTGGCTTCTG 
GGAGTTTTAG 



ATCCCGRGTC 
OGCTCTGACG 
GTGAGGCCCT 
CAGGACTTTC 
TCX3GTCTGGC 
CTTCATTCAT 
TCCCGAQTQA 
TCACGGCTGG 
TGTCGGGCCC 
ACCTAGATAC 
AACTGQAATT 
ATG6ACTAAT 



Seq ID NO: 379 Protein sequence 
Protein Accession #: NP_002096



51 



1 
I 

ACGOGTCCGG 
GGGCGAGCGG 
CA6GCGAGGC 
GOQATGAAGA 
CATGTCC36CA 
CTGTTGATTT 
CTGGGAGGTG 
CTTCTCATGA 
TGGATCATCC 
ATCACTGTQC 
TTTCCCTACA 
CTGTTTATTA 
TACOGATACA 
ACTACX3GTQC 
CCGCCACCTT 
CTTTGCAGAC 
TTGTTTGTTG 
TCAACATATG 
TGGGGATATA 
OGACCTAGAA 
TCOCTCTCTT 
CCAGCATAGA 
ATTAGGTAAA 
TGACTGAAGT 
TAAGACCATT 
GATCTTGTGT 
GTGTTTGGCQ 
GGCCCGCTTT 
TATCAAGTGG 
TCTTTTCCCT 
TAAAATGTAA 
GACTACCTGA 
TGTTGGTTCA 
CCACATCCAA 



11 
I 

CAGAAGCTCG 
GCCGGGAGCC 
GGT0GACX3CT 
TGGTCGOGCC 
COGGCAOCAT 
TATTGAGT6C 
ACTTT6A6TT 
TCCTGATATG 
CATTCTTCTG 
TTATTTATCC 
GA6AT6AT6T 
GCATTATCTT 
TCAATGGTAG 
TGCTACCCCC 
ACGTGTCTGC 
ATCT6AGCAA 
CTGAAATGCT 
CTTTGCTAGA 
AC66GCTTCA 
OTCTGCTTTT 
TTGAAAATGT 
GAACAAAACC 
TAGAAGTCCT 
TCAATGACAG 
AGAAAGCACC 
GCAGGGACAT 
CTGCATGGGA 
TACTAAGTGT 
AATTGGGATA 
GCAAGCTACA 
ACATTTTCAG 
ATTGCAAQGG 
TTATTGAATG 
AAAAAAAAAA 




TCTGCCCTAG 
TATTTGATAT 
TCCTACTGCT 
AAAAATGA6G 
ATTTTTATAT 
T6CTGTAAAT 
AAAAA 



41 

1 

AGGCA6GCCC 
AGCAGCGGCG 
CTCGCGCCAC 
ACA6CTGCTG 
TGATCATCAA 
ATAACTTTTC 
GCATTGCCAT 
CGTACAAGCA 
CCCTGAACAT 
TAOGG CRACT 
GTTTGGTCCT 
TTAGCTGTGT 
TTTATGTTAC 
ATGGTGCTGC 
A6CTGACGGC 
TGCCATGAGC 
TAGATT6AAA 
TAOAATTCTT 
AAACTTCCCC 
TTGGGCATTT 
CAACTTTTTC 
ATTGT6TAAT 
TTCCCCX31CA 
ATTTTCTCCA 
ATCTACTGAC 
TGTTAGAGGG 
GGATTCACAT 
GGAGGTCATC 
AACAACATGG 
AAGTATGTCT 
TTGTATGCGC 
TACAAAGTCA 
GCAATTAAAA 



51 
1 

G0GGG0GC31C 
CGGCGGGCTC 
TGCGCCCQGA 
CTTGTGCTGC 
TQCTQTGGTA 
7U\GTTCTGAA 
TGCGATTTCT 
ACGCGCAGCC 
GTTGGTTGCA 
GCCTCCTAAT 
TATTATTCTT 
TTGGAACTGC 
CAGCAATGAC 
CAAGGAGCCA 
AGCAGCTTffi^ 
CTCTCTQAGC 
ACTGTA6TTT 
CCTGTAC6AT 
CAAATCTGAT 
TTCTCTCTGT 
TTCAGCCATT 
CATT6TTCTA 
ACATCCTTTA 
TGGCCTGAAT 
TGTTCTT6TG 
TGGAATGGAT 
CCCCACCCA6 
CAACTGACTT 
AAAAGGOTTT 
AGTCACCTTT 
TTTTTACCTT 
GCAACrCTCC 
CAAGGTTTGC 



Seq ID NO: 381 Protein sequence
Protein Accession #: CAB66876

1 11 21 31 41 51

I I I I I I

MKMVAPMTRP YSNSCCLCCH VRTGTILLGV WYLIINAWL I»ILLSALADP DQYNFSSSEL 
GGDFEFMDDA NMCIAIAISL I<MILICAMAT YGAYKQRAAW IIPFPCYQXF DFALNMLVAI 
TVIiIYPirSIQ SyiRQIiPRNF PYRDDVMSVN PTCLVLIILL FISIILTFKG YLISCVMNCy 
RYINGRNSSD VLVYVTSNDT TVLXiPFYDDA TVNQAAKEPP PPYVSA 

Seq ID NO: 382 DNA sequence
Nucleic Acid Accession #: NM_002510
Coding sequence: 92-1774



1 
I 

CAGATGCCA6 
CCTTGAQTGC 
TCTGCTCCTG 
CAATGAAAGA 
TGAAAATGAC 
AAACTCCTGG 



11 
I 

AAGAACACTG 
CTGCGTCCGT 
GCTGCAAGAT 
CCTTCTGCTT 
TQGAATGAAA 
AAGGGAGGCC 



21 

I 

TTGCTCTTGG 
6AGAATTCAG 
TGCCACTTGA 
ACATGAGGGA 
AACTCTACCC 
<3TGT6CAGGC 



31 
I 

TGGAOGGGCC 
CATGGAATGT 
TGCCGCCAAA 
6CACAATCAA 
AGTGTGGAA& 
GGTGCTGACC 



41 

1 

CAGAGGAATT 
CTCTACTATT 
CQATTTCATG 
TTAAA7GGCT 
GGGGGAGJ^ 
AGIGACTCAC 



SI 

1 

CAGAGTTAAA 
TCCTGGGATT 
ATGTGCTGGG 
GGTCTTCTGA 
TGAGGTGGAA 
C3U3CCCTCGT 






1 11 21 31 41 

I 1 I I I I 

MSGRGKTGGK ARAKAKSRSS RAGLQPPVGR VHRLLRKGHY AERVGAGAPV YLAAVLEYLT 
AEILBLAGNA ARDNKKTRII PRHLQZ«AIKN DEBItNKLLGG VTIAQGGVLP NIQAVZiLPKK 
TSATVGPKAP SGGKKATQAS QEY 



Seq ID NO: 380 DNA sequence
Nucleic Acid Accession #: AL136942
Coding sequence: 164-664



G6GCTCAAAT ATAACATTTG CGGTGAACCT GATATTCCCT AGATGCCAAA AGOAAGATGC 420

CAATGOCSUU: ATRCTCTATG AOAAGAACTG CAGAAATGA6 GCTGGTTTAT CTGCTOATCC 480 

ATATGTTTAC AACTGGACRG CATGGTCAGA GGACAGTGAC GGOQAAAATG GCACOGGCCA 540 

AAGCCATCAT AACGTCTTCC CTGATGGGAA ACCTTTTCCT CACCACCCCX5 GATGGAGAAG 600 

ATGGAATTTC ATCTACX3TCT TCCACACACT TGGTCAQTAT TTCCAGAAAT TGGGAOGATG 660 

TTCAtSTGAGA GTTTCTGTGA ACACAGCCAA TGTGACACTT GGGCCTCAAC TCATGGAAGT 720 

GACTGTCTAC AGAAGACATG OACXMGCATA TGTTCCCATC GCACAAGTGA AAGATGTX5TA 780 

CQTGGTAACA GATCAOATTC CTOTOTTTQT QACTATGTTC CAGAAGAAOG ATCGAAATTC 840 

ATCCGACGAA ACCTTCXTTCA AAOATCTCCC CATTATGTTT GATGTCCTGA TTCATQATCC 900 

TAGCCACTTC CTCAATTATT CTACCATTAA CTAC3UU3TGG AGCTTOQGGG ATAATACT6G 960 

CCTGTTTGTT TCCACCAATC ATACTGTGAA TCACAOSTAT GTGCTCAATG GAAOCTTC3W3 1020 

CCTTAACCTC ACTOTOAAAG CTGCAGCACC AGGACCTTGT CCGCCACCQC CACCACCACC 1080 

CAGACCTTOV AAACCCACCC CTTCTTTAGG ACCTGCTGGT GACAACCCCC TGGAGCTQAG 1140 

TAGGATTCCT GATGAAAACT GCCAGATTAA CAGATATGGC CACTTKaAG CCACCATCAC 1200

AATTGTAGAO GGAATCTTAG AGGTTAACAT CATCCA6ATG ACAGACGTCC T6ATGG08GT 1260 

GCCATG6CCT GAAAGCTCCC TAATAGACTT TGTCGTGACC TGCGAAGGGA GCATTCCCAC 1320 

GQAGGTCTGT ACCATCATTT CT6ACCCCAC CTQCX3AQATC ACCCAGAACA CAGTCT6CAG 1380 

CCCT6TGGAT GTGGATGAGA TOTOTCTGCT GACTGTGAGA OGAACCTTCA ATGGGTCTGG 1440 

GAOGTACTCT GTGAACCTCA CCCTGGGGGA TQACACAA6C CTGGCTCTCA OGA GCACC CT 1500 

GATTTCT6TT CCTGACAGAG ACCCA6CCTC GCCTTTAAGG ATGGCAAACA GTGCCCTGAT 1560 

CTCCGrrGGC TGCTTGGCCA TATTTGTCAC TGTGATCTOC CTCTTGGTGT ACAAAAAACA 1620 

CAAGGAATAC AACCCAATAG AAAATAGTCC TGGGAATQTG GTCA6AAGCA AAGQCCTQAG 1680 

TGTCrTTCTC AACOGTGCAA AAGCCGTGTT CTTCCCGGGA AACCAGGAAA AGGATCGGCT 1740 

ACTCAAAAAC CAAOAATTTA AAGGAGTTTC TTAAATTTCX3 AOCT TGTTTC T6AA0CTCAC 1800 

TTTTCAGTGC CAriGATGTG AGATGTGCTG GA6T6GCTAT TAACCrTTTT TTCCTAAAGA 1860

TTATTOTTAA ATAQATATTG TGGTTTGGGG AAGTTGAATT TTTTATAGGT TAAATGTCAT 1920 

TTTAGA6ATG GGGAGAGGGA TTATACTGCA GGCAGCTTCA GCCATGTTGT GAAACTGATA 1980 

AAAGCAACTT AGCAAGGCTT CTTTTCATTA TTTTTTATGT TTCACTTATA AAGTCTTAGG 2040 

TAACTAGTAG GATAGAAACA CTGTGTCCCG AGAQTAAGGA GAGAAGCTAC TATTGA TTAG 2100 

AGCCTAACCC AGGTTAACTG CAAGAAGAOO OGGQATACTT TCAGCTTTCC ATGTAACTGT 2160 

ATGCATAAAG CCAATGTAGT CCAGTTTCTA AGATCATGTT CCAAGCTAAC TGAATCCCAC 2220 

TTCAATACAC ACTCATGAAC TCXTOATGGA ACAATAACA6 GCCCAAGCCT GTGG TATGA T 2280 

GTGCACACTT GCTAGACTCA GAAAAAATAC TACTCTCATA AATGGGTGGG AGTATTTTGG 2340 

TGACAACCTA CTTTOCTTGO CTGAGTGAAG GAATGATATT CATATATTCA T TTATTC CAT 2400 

GGACATTTftG TTAGTOCTTT TTATATACCA GGCftTQATGC lOAGIGACAC TCTTOTGTAT 2460 

ATTTCCAAAT TTTTGTATAO TCXSCTGCACA TATTTGAAAT CATATATTAA OACTTTOOIA 2520 

AGATGA6GTC CCTOGTTTTT CATGGCAACT TGATCAGTAA GGATTTCACC TCTGTTTGTA 2580 

ACTAAAACCA TCTACTATAT GTTAGACATG ACATTCTTTT TCTCTCCTTC CTGAAAAATA 2640 
AAGTGTQG6A AGAGACAAAA AAAAAAAAA 

Seq ID NO: 383 Protein sequence
Protein Accession #: NP_002501

I 11 21 31 41 51 

I I I I 1 I 

MBCLyyFLGF LIiLAARLPLD AAKRFKDVL6 NERPSAYHRE BNQLNGWSSD ENDHNBKLYP 60 

VWKRSJMRWK NSWKGGRVQA VLTSDSPALV GSNITFAVNL IPPROQKEDA NGNIVYEIQIC 120 

RNEAGLSADP YWNWTAWSE DSDGENGTGQ SHHNVFPDGK PPPHHPGWRR WNPIYVFHTL 180 

QQYPQKLGRC SVRVSVNTAN VTLGPQLMEV XVYRRHGHAY VPIAQ VKPyY WTDQIPVFV 240 

TMFQKMDRNS SDETFUODIfP IMFDVLXHDP SHFLNYSTIN YKWSPGDNTG LFVSTNHTVN 300 

BTyVLNGTFS IiNZiTVKAAAP GPCPPPPPPP RP8KPTPSLG PAGDNFLBLS RIPDENCQIK 360 

RYGHF^ATIT IVEGILBVNI IQMTDVIUPV PWPESSLIDF WTCQGSIPT BVCTIISDPT 420 

CBITQNTVCS PVDVDEMCLIi TVRRTENGSG TYCVMLTLGD DTSLALTSTL ISVPDRDPAS 480 

PIiRHANSALI SVGCLAIPVT VISLLVYKKH KBVHPIEUSP GMWRSKGLS VPLNRAKAVP 540 
FPGNQEKDPL LKNQBFKGVS 

Seq ID NO: 384 DNA sequence
Nucleic Acid Accession #: NM_001134
Coding sequence: 48-1877 

1 11 21 31 41 51 

I I I I I 1 

TCCATATTQT GCTTCCACCA CTGCCAATAA CAAAATAACT A6CAACCAT6 AAGTGGGTGG 60 

AATCAATTTT TTTAATTTTC CTACTAAATT TTACTOAATC CAGAACACTG CATAGAAATG 120 

AATATGGAAT A6CTTCCATA TTGQATTCTT ACCAATGTAC TGCAGAGATA AGTTTAGCTG 180 

AOCTGGCTAC CATATTTTTT GCCCAOTTTG TTCAAGAAQC OVCTTACAAG GAAGTAAGCA 240 

AAATGGTGAA AGATGCATTG ACTGCAATTG A6AAACCCAC TGGAGATGAA CAGTCTTCAG 300 

GGTGTTTAGA AAACCAGCTA CCTGCCTTTC TGGAAGAACT TTGOCATGAG AAA6A AATTT 360 

TGGAGAAGTA CGGACATTCA GACTGCTGCA GCCAAAGTGA AGAGGGAAGA CATAACTQTT 420 

TTCTT6CACA CAAAAAGCCC ACTCCAGCAT CGATCOCACT TTTCCAAGTT CCAGAACCTG 480 

TCACAAGCTG TGAAGCATAT QAAGAAGACA GGGAGACATT CATGAACAAA TTCATTTATG 540 

AGATAGCAAG AAGGCATCCC TTCCTGTATG CACCTACAAT TCTTCTTTGG OCT6CTCX3CT 600 

ATGACAAAAT AATTCCATCT TGCT6CAAA0 CTGAAAAT6C AGTTGAATGC TTCCAAACAA 660 

AGGCAGCAAC AGTTACAAAA GAATTAAGAG AAAGCAGCTT GTTAAATCAA CAT6CATGTG 720 

CAGTAATGAA AAATTTTGGG ACCCGAACTT TCCAAGCCAT AACTGTTACT AAACTGAGTC 780 

AQAAGTTTAC CAAAGTTAAT TTTACTGAAA TCCAGAAACT AGTCCTGGAT GTGGCCCATG 840 

TACATGAGCA CTGTTGCAGA GGAGATGTGC TGQATTGTCT OCAGGATGGG GAAAAAATCA 900 

TGTOCTACAT AT6TTCTCAA CAAGACACTC T6TCAAACAA AATAACAGAA TGCTGCAAAC 960 

TGACCACGCT GGAACGTGGT CAATGTATAA TTCATGCAGA AAATGATGAA AAACC TGAAG 1020 

GTCTATCTCC AAATCTAAAC AGGTTTTTAG GAGATAGAGA TTTTAACCAA TTTTCTTCAG 1080 

GGGAAAAAAA TATCTTCTTG GCAAGTTTTG TTCATGAATA TTCAAGAAQA CATCCTCAGC 1140 

TTGCTGTCTC A6TAATTCTA A6AGTTGCTA AAGGATACCA GGAGTTATPG QA6AAGTGTT 1200 

TCCA6ACTGA AAACXXTTCTT OAATGCCAAO ATAAAG6AGA AGAAGAATTA CAGAAATACA 1260 

TGCAGGAGAO CCAAGCATTG GCAAAGCX3AA GCTGGGGCCT CTTCGAGAAA CTAGGAGAAT 1320 

ATTACTTACA AAATGCGTTT CTCGTTGCTT ACACAAAGAA AGCCCCCCAG CTGACCTCGT 1380 

CGGAGCTGAT GGCCATCACX: AGAAAAATGG CAGCCyVCAGC AGCCACTTGT TGCCAACTCA 1440 

GTGAGGACAA ACTATTGGCC TGTGGCGAGG GA60GGCTGA CATTATTATC GGACACTTAT 1500

GTATCAGACA TQAAATGACT CCAOTAAACC CTOGTGTTGG CCAGTGCTGC ACTTCTTCAT 1560

ATGOCAACAG GRQ6CCAT0C TTCaGCSVGCT TQGT0C3TGGA TGAAACATAT GTCCCTCCTG 1620 

CATTCTCTGA TC3ACRAGTTC ATTTTCCaTA AGGATCTGTG CCAAGCTCAG GGTGTAGCGC 1680 

TGCAAACGAT QAAGCAAGAG TTTCTCATTA ACCTTGTGAA GCAAAAGCCA CAAATAACAG 1740 

A6GAACAACT TGAGGCTGTC ATTGCAGATT TCTCAGGCCT GTTGGAGAAA TGCTGCCAAG 1800 

GCC AGGAA CA G GAAG TCTCC TTTGCTGAAG AGGGACAiU^ ACT6ATTTCA AAAACTCX3TG 1860 

CTGCTTTQGQ AQT TTAAA TT ACTTCAGGGG AAGAQAAGAC AAAACGAGTC TTTCATTCGG 1920 

TGTXSAACTTT TCTCTTTAAT TTTAACTGAT TTAACaCTTT TTGTOAATTA ATGAAAMAT 1980 
AAAQACTTTT ATGTGAGATT TCCTTATCAC AGAAATAAAA TATCTCCAAA TG 



Seq ID NO: 385 Protein sequence
Protein Accession #: NP_001125



1 11 21 31 41 51

I I I I I I

MKWVESIPLI PLLNPTESaT LHRNBYGIAS ILDSYQCTAB ISIADLATIF PAQFVQEATY 60
KEV8K^4Vla3A LTAIEKPTCro- EQSSGOiBNQ LPAFLBBLCE SKBILBK»a! SDCCSQSEEO 120
RHNCPLAHKK PTPASIPLFQ VPBPVTSCBA YEEDRETPMN KPIYBIAHRH PPLYAPTILI* 180
WAARYDKIIP SCCKAEWAVE CFQTKAATVT KELBESSLhS QHACAVMKNP GTRTFQAITV 240
TKLSQKPTKV MPTBIQKLVL DVAHVHEHCC RGDVLDCLQD GEKIMSYICS QQDTLSNKIT 300
BCCKLTTLER GQCIIHAEND EKPEGLSPNL NRFLGDRDFN QFSSGEKNIP LASFVHBYSR 360
RHPQLAVSVI LRVAKGYQBL LEKCFQTENP LEOQDKGEBB LQKYIQBSOA LAXRSCGLPQ 420
KLGEYYLQNA FLVAYTKKAP QLTSSBLMAI TRKMAATAAT CCQLSEDKLL AOSBQAADIl 480
IQHLCIRHEM TPVNPGVGQC CTSSYANRRP CPSSLWDET YVPPAPSZaJK PIEHKDLCQA 540
QGVALQTMKQ EPLINLVKQK PQITSEQIiBA VIADPSGItljE KCOQGQBQEV CPABBGQKLI 600
SKTRAALGV 

Seq ID NO: 386 DNA sequence
Nucleic Acid Accession #: NM_002205.1
Coding sequence: 1..3149



1 11 21 31 41 51 

I I I I I I

ATGGGGA6CC GGACXSCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG QGGCCCCCGG 60 
OGCCGAOCCC CGCTSSTGCC GCTOCTGTTG CTOCTSSTGC CGCOGCCACC CAGGGTGGGG 120 
QGCTTCAACr TA6A0G0GGA GGCX:CX3«3CA GTACTCT006 G6CCCCCGGG CTCCTTCPTC 180 
GGATTCTCAG TGGAGTTTTA CCGGCCGG6A ACAGACGGGG TCAQTGTQCT GGTGGGAGCA 240 
CCCAAGGCTA ATACCAGCCA GCCAGGAGT6 CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 
TGGGGTGCCA GCCXKACACA GTCCACCCCC ATTGAATTTG ACAGCftAAGG CTCTOGGCTC 360 
CTGGA6TCCT CACTGTCCAG CTCAGAGGGA GAGOAGGCTa TGGAOTACAA GTCCTTGCAG 420 
TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT T6GCATGGGC TOCACTGTAC 480 
AGCTGGC6CA CAGAGAAGGA OCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 
QATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCOGCT CAGATTTCA6 CTQGGCAQCA 600 
GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTCTGGTT 660 
TTAGGTGGAC CA6GAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TGAOQAOCAG 720 
ATTGCAGAAT CTTATTACCC C6AGTACCTG ATCRACCTGG TTCA6GGGCA GCTGCAQACT 780 
CGCCAG6CCA GTTCCATCTA TQATGACAGC TACCTAGGAT ACTCTOTGGC TGTTOGTGAA 840 
TTCAOrOGTa ATGACACAGA AaACTTTGTT GCTGGTGTGC CCAAAGGQAA CCTCACTTAC 900 
GGCTATGTCA CCATCCTT/Ul TGGCTCAGAC ATTCGATOCC TCTACAACTT CTCAGGGGAA 960 
CAGATGGCCT CCTACTTT66 CTATGCAGTG GCCGCCACAG ACGTCAATOG GGAOSGGCrG 1020 
GATGACTT6C TGGTGGGGGC ACCCCTGCTC ATOQATCXSGA CCCCTQACGG GCGGCCTCAG 1080 
GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCC!G GCATA6AGCC CACGCCCACC 1140 
CTTAOCCTCA CTGGCCATGA TGAGTTTGGC GGATTTGGCA GCTCCTTGAC CGOCCTGGGG 1200 
GACCTGGACC AOQATOGCTA CAATSATQTO GCCATCQGGG CTCCCTTTOG TGGGGAGACC 1260 
CAGCA6GGAG TA6TGTTTQT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 
CAGGTTCTGC AQCCCCTGTG GGCAGCCAGC CaCACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 
CGAGGAQ6CC GAGACCTGGA TGGCAATGGA TATCCTGATC T6ATTGTGGG GTCCTTTGGT 1440 
GTGGACAAGG CTOrOGTATA CAGGGOCOSC CCCATCGTQT COOCTAOTGC CTCCCTCAC3C 1500 
ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 
GCCTGCA'TCA ACCTTAGCTT CTGCCTCAAT 6CTTCTGGAA AACAC3GTTGC TQACTCCATT 1620 
GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT AOGGCGGGCA 1680 
CTGTTCCTGG CCTCCAGGCA GGCAACXZCTG ACCCAGACCC TGCTCATCCA GAATOGGGCT 1740 
CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAAOG AGTCAGAATT TCGAGACAAA 1800 
CTCTCGCCXSA TTCACATCGC TCTCAACTTC TCCTTQGAOC CCCAAGCCCC AGTG6ACAGC 1860 
CACGGCCTCA GGCCAGCCCT ACATTATCAG A6CAAGAGCC GGATAQAGGA CAAGGCTCAG 1920 
ATCTTGCTGQ ACTOrGGAOA AQACAACaTC TGTGTGCCTG ACCTGCAGCT G6AAGTGTTT 1980 
GGOGAGCAGA ACCATGTOTA CCTGGGTGAC AAQAATGCCC TGAACCTCAC TTTCCATGCC 2040 
CAGAATGTGG GTGAGGGTGG CGCCTATGA6 GCTGAGCTTC GGGTCACXX3C CCCTCCAGAG 2100 
GCTGAGTACT CAGGACTCX3T CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTQACTAC 2160 
TTT6CCGTGA ACCAGA6C06 CCTGCTGGTG ITGXGAGCTGG GCAACOXAT GAAGGCAGGA 2220 
G0Catf 3TCTGT GGQGTGGCCT TOOGTTTAC» GTCCCTCSWC TCOSGGACAC TAAGAAAACC 2280 
ATCCAGTTTG ACTTCCAGAT CXTTCAGCAAG AATCTCAACA ACTCGCAAAO OGACGTGGTT 2340 
TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 
GAGGCAGTGC TATTCCCAGT AAGGGACTGG CATCCCCX3AG ACCAGCCTCA GAAGQAGGAG 2460 
6ACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 
AGCCAGGGTG TGCTOSAACT CftGCTGTCCC CAGGCTCTGG AAGGICAGCA GCTCCTATAT 2580 
GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAAOCC AAAGGGCCTG 2640 
GAGTOXSQATC COGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 
TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 
TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCA6TTGCA TTTCCGAGTC 2820 
TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAOCCSUTPTA GOCTGCAGTG TOAGGCTaTG 2880
TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 
GAGGTGGGCA CAGCTQTOCA ATG6ACCAAG GCA6AAGGGA GCTATGGOGT CCCACTGTGG 3000 
ATCATCATCq TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 
TACAAOCTTG QATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGOCTC CAGCCaCCTC TGATGCCTGA

Seq ID NO: 387 Protein sequence
Protein Accession #: NP_002196.1

1 11 21 31 41 51

I I I I I I

MGSRTPESPL KAVQLRHQPR RRPPLLPLLL LLLPPPPRVG GFKLDAEAPA VLS6PPGSFF 60 

GFSVEFYRPG TDGVSVLVGA PKAMTSQPSV LQGGAVYLCP N6ASPTQCTP lEFDSKGSRL 120 

I.ESSLSSSB6 ERPVEYKSLQ WPGATVRAHG SSILACAPLY SWRTEKBPLS DPVGTOfLST 180 

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRW LGGPGSYFWQ GQILSATQBQ 240

lAESYYPEYI* *INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKQNLTY 300 

GYVnUJGSD IRSLVNFSGB QKASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

BVGRVyVyiiQ HPAGIEFTPT LTLTGHDBFG RFGSSLTPLG DLDQDGVNDV AXGAPFGGET 420 

QQGWFVFPG GPGGIiGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGMQ YP DLIV GSFG 480 

VDKAWYRGR PIVSASASLT IPPAMFNPEE VLSCShEGNFV ACIKLSFCLH ASGKBVADSI 540

GFTVBLQLDW QKQRGGVRRA LPLASRQATIi TQTLLIQNGA REDCREMKIY LRMESBFRDK 600 

LSPIHIALNP SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLBVF, 660 

GBQHHVYLGD KNALNLTPHA QNVQEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 

PAVNQSRLLV CDLGNPMKAG ASLWGGLRPT VPHLRDTKKT IQFDPQIIiSK NLNNSQSDW 780 

SFRI>SVBAQA QVTLKGVSKP BAVZiFPVSDir HPHSQFQKEB DLGPAVKHVY ELINQGPSSI 840

SQGVLBLSCP QALBGQQiiLY VTHVTGUTCT TNHPINPKGL ELDPBGSIiHH QQKREAPSRS 900 

SASS6FQILK CPBAECFRLR GELGPLHQQE SQSLQZ£FRV WAKTFIiQREH QPFSLQCEAV 960 

YKALKMPYRI LPRQLPQKER OVATAVQWTK AEGSYGVPIM IIILAILPQL LLLGLiZYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPFATSDA 






Seq ID NO: 388 DNA sequence
Nucleic Acid Accession #: NM_002425
Coding sequence: 26..1453

1 11 21 31 41 51

1 I I I I I 

AAAGAAG6TA AGG6CAGTGA GAATGAT6CA TCTTGCATTC CTTX5TGCTGT TGTGTCTGOC 60 

AGTCTGCTCT GCCTATCCTC TGAGTGGGOC AGCAAAAQAQ GAGGACTCCA ACAAGQATCT 120 

TGCXICAGCAA TACCTAGAAA AGTACTACAA CCTOGAAAAG GATGT6AAAC AGTTT AGAAG 180 

35 AAAOGACAGT AATCTCATT6 TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240 

GOTGACAGGG AAGCTAGACA CTQACACTCT GGAGGTGAT6 a3CAAGCXX:A GGTGTGGftGT 300 

TCCTQACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCOG AAGTGG AGGA AAACCCACCT 360 

TACATACAGG ATTOTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTO ATTCTGCCAT 420 

TGAGAAAGCT CT6AAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCT GT ATGA 480 

AGGAGAGGCT GATATAATGA TCTCTTTOQC AGTTAAAOAA CATGOAGACT TTTACTCTTT 540

TGATG6CCCA GGACACAGTT T6GCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600 

TATTCACTTT GATGATGATG AAAAATGGAC AGAAQATGCA TCAGGCACCA ATTTATTCCT 660 

CGTTGCTGCT CATGAACTTG GCCACTCOCT GGGQCTCTTT CACTCAGCCA ACACTGAAGC 720 

TTTGAT6TAC CCACTCTACA ACTCATTCAC AGAGCTOGCC CAOTTCOGCC TTTOGCAAGA 780 

TGATGTQAAT GGGATTCAOT CTCTCTACQ6 ACCTCCCCCT 6CCTCTACTG AGQAAC OCCT 840

GOTGOCCACA AAATCTOTTC CTTCaOQATC TGAOATGCCA OCCAAGTGTG ATCCTGCTTT 900 

GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960 

TTGGCQAAGA TCCCACTGGA ACCCTGAACC T6AATTTCAT TTGATTTCTG CATTTTGGCC 1020 

CTCTCTTCCA TCATATTTGG ATGCTGCATA T6AAGTTAAC AGCAGGGACA CCCTTTTTAT 1080 

TTTTAAAGGA AATGAGTTCT GGGCCATCA6 AGOAAATGAG GTACAAGCAG GTTATCCAAG 1140

AOGCA,TGCAT ACCCTGGGTT TTCCTCXSiAC CATAAGQAAA ATTGATGCAG CTGTTTCTGA 1200 

CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGGGGACAAA TACTGGAGAT TTGATGAAAA 1260 

TAGCCAGTCC ATGQAGCAA6 GCTTCCCTAO ACTAATAGCT QATGACTTTC CAGGAGTTGA 1320 

GCCTAA3GTT GATGCT6TAT TAOiGGCATT TQGATTTTTC TACTTCTTCA GTGGATCATC 1380 

ACAGTTTQAO TTT6ACCCCA ATGOCAGQAT 60TGACACAC ATATTAAAGA GTAACAGCTG 1440

GTTACMTOC TAGGCGAGAT AGGGGGAAGA CAGATAT6GG TGTTTTTAAT AAATCTAATA 1500 

ATTATTCATC TAATGTATTA TGAQCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 1560 

QAAQAAGATG AGCCTTGCAG ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 1620 

ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA 1680 

ATGTATTTTC ATAGATGTGT TATTACTTCC TGAATAAAAA GTTTTATTTT GG6CCTGTTC 1740
CTT 



Seq ID NO: 389 Protein sequence 
Protein Accession #: NP_002416



11 21 31 41 51 

MHLAFLVIiIiC LPVCSAYPI.8 GAAKBEDSNK DLAQOYLEKY YNLEKDVKQF RRKDSNLIVR 60 

KIQGMQKFLG LBVTGKLDTO TIiEVMRKPRC GVPDVGHFSS PPOIPRWRKT HLTYRIVNYT 120 

PDLPRDAVDS AIEKALKVHE EVTPLTPSRL YEGEADIMIS FAVKEHGDFY 8FDGPGHSLA 180

HAYPPGPGLY GDIHFDDDCK WTEDASGTNL FLVAAHELGH SLGLFHSANT' EAIiMYPIiYNS 240 

FTELAQFRLS QODVNGIQSL YGFPPASTEE PLVPTKSVPS G8EMPAKCDP ALSFDAI8TL 300 

RGEYLFPKDR YFWRRSHHNP EPBFHLIBAF NPSLPSYLDA AYEVNSR23TV FIFKGNBFKA 360 

IRGNEVQAGY PRGIHTLGFP FTIRKIDAAV SDXBECKICiyF FAADKYWRFD ENSQSKEQGF 420 

PRLIAODFPG VEPKVDAVLQ AFGFFYFFSO SSQFBFDPNA RMVTKIIiKSN SWLBC



Seq ID NO: 390 DNA sequence
Nucleic Acid Accession #: NM_002421.2
Coding sequence: 1..1409



1 11 21 31 41 51 

I I I I i i 

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CT6TTCTGGG 6TGTGGTGTC ACACAGCTTC 60 

CCAGOGACTC TAGAAACACA AGAGCAAGAT 6TGGACTTAG TCCAGAAATA CCTGOAAAAA 120 

TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GQAGAAATAG TGGCCCAGTG 180

GTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCrGA AAGTGACTGG GAAACCA6AT 240 

GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AOATGTGGAG TGCCTGATGT GGCTCAGTTT 300

GTCCTCACTG AGGGGAACCC TGGCTQ6GA6 CAAACACATC
TACAGGCCAO ATTTGCCAAO A6CA0AT6T6 OACCATQCCA 
TGSAQTAMG TCACACCTCT GACATTCftCC AAGGTCTCTO 
ATATCTTTTG TCA6GGGA6A TCATC6GGAC AACTCTCCTT 
CTTGCTCATG CTTTTCAACC AGGCXTCAGGT ATTOGAGGGG 
GAAAGGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC 
GQCCATTCTC TTGGACTCTC OCATTCTACT GATATGG6GG 
ACCTTCAGTG GTGATGTTCA GCTAQCTCAG QATGACATTG 
6GACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA 
AAGCTAACCT TTGATGCTAT AACTACGATT CGGGGAGAAG 
TTCTACATGC GCACAAATCC CTTCTACXCG GAAGTTGAGC 
TGGCCACAAC IGCCAAATGQ GCrTGAAGCT QCTTACGAAT 
G60TTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC 
CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAQAACT6 
CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCTTTGTTQ 
GATGAATATA AACGATCTAT GGATCCAGGT TATCCCAAAA 
GGAATTGGCC ACAAAGTT6A TGCAGTTTTC ATGAAAGATO 
GGAACAAGAC AATACAAATT TGATCCTAAA AOSAAGAQAA 
AATAGCT6GT TCAACTGCAG GAAAAATTAG 

Seq ID NO: 391 Protein sequence
Protein Accession #: NP_002412.1



T6ACCTACAG 
TTGAGAAAGC 
A6GGTCAAGC 
TTGATGGACC 
AT6CTCATTT 
GTGTT6CGGC 
CTTTGATGTA 
ATGGCATCCA 
CCCCAAAAGC 
TGATGTTCTT 
TCAATTTCAT 
TTGCOGACAO 
A6AAT0T6CT 
TGAA6CATAT 
CTAACAAATA 
TGATAGCACA 
GATTTTTCTA 
TTTTGACTCT 



GATT6AAAAT 
CTTCCAACTC 
A6ACATCAT6 
TGGAGGAAAT 
TGATGAAGAT 
TCATGAACTC 
CCCTAGCTAC 
AGCCATATAT 
AT6TGACAGT 
TAAAGACAGA 
TTCTGTTTTC 
AGATOAAOTC 
ACACGGATAC 
C6ATGCTGCT 
CTGGA6GTAT 
TGACTTTCCT 
TTTCTTTCAT 
CCAGAAAGCT 






1 
I 

MHSPPPLLLL 
VEKLKQMQBP 
YTPDLPRADV 
IiAHAFQPGPG 
TFSGDVQLAQ 
FYMRTHPFYP 
PKDIYSSFQP 
GIGRKVDAVF 



11 
I 

LPW6WSRSF 
PGLKVTGKPD 
DHAIEKAFQL 
IGGDAHFDBD 
DDIDGIQAiy 
EVELNPISVP 
PRTVKBXDAA 
MKXX3FPYFFH 



21 

1 

PATLETQEQD 
AETUCVMKQP 
WSNVTPLTFT 

QiSQNFVQPI 
WPQIiPNGLBA 

LSBBtrrGKTy 

GTRQYKFDPK 



31 
I 

VDtiVQKyiiEK 
ROGVPDVAQF 
KVSBGQADIM 
NLHRVAAHBL 
GPQTPKACDS 
AYEPADRDEV 
FFVAIinCYHRY 
TKRILTLQKA 



41 
I 

yyiTLKMDGRQ 
VLTEGNPRWE 
ISFVRGDHRD 
GHSLGLSHST 
KLTBDAITTI 
RFFXaNKYHA 
DBYKRSHDP6 



SI 
1 

VEKRRKSGPV 
QTHLTYRIBN 
NSPFDGPG(»T 
DIGALNYPSY 
RGBVHFFKDR 
VQGQMVLKGY 
YPKMIAHDFP 



Seq ID NO: 392 DNA sequence
Nucleic Acid Accession #: NM_002421.2
Coding sequence: 1..1409



1 
I 

ATGCACAGCT 
CCAGCGACTC 
TACTACAACC 
GTTGAAAAAT 
GCTGAAACCC 
GTCCTCACTG 
TACACGCCAG 
TGGAGTAATG 
ATATCTTTTG 
CTTGCTCATG 
GAAAGGTGGA 
GGCCATTCTC 
ACCTTCAGTG 
GGAG6TTCCC 
AAGCTAACCT 
TTCTACATGC 
TGGCCACAAC 
OOGnXTTCA 
CCCAAGGACA 
CTTTCTGAGG 
GATGAATATA 
GGAATTGGCC 
GGAACAAGAC 
AATAGCTGGT 



11 

I 

TTCCTCCACT 

TAGAAACACA 
TGAAGAATGA 
TGAAGCAAAT 
TGAAGGTGAT 
AGGGGAACCC 
ATTTGCCAAG 
TCACACCTCr 
TCAGGGGA6A 
CTTTTCAACC 
CCAACAATTT 
TTGGACTCTC 
GTGATGTTCA 
AAAATCCTGT 
TTGATGCTAT 
GCACAAATCC 
T6CCAAATG6 
AAGGGAATAA 
TCTACAGCTC 
AAAACACTGG 
AACGATCTAT 
ACAAA6TTGA 
AATACAAATT 
TCAACTGCAG 



21 
i 

GCTGCTOCTO 
AGAGCAAGAT 
TGGGAGGCAA 
GCAGGAATTC 
GAAGCAGCCC 
TC6CTGQQAG 
AGCAGATGTG 
GACATTCACC 
TCATCGGGAC 
AGGCCCAGGT 
CAGAGAGTAC 
CCATTCTACT 
GCTAGCTCAG 
CCAGCCCATC 
AACTACGATT 
CTTCTACCCG 
GCTTGAAGCT 
GTACTGGGCT 
CTTTGGCTTC 
AAAAACCTAC 
GGATCCAGGT 
TGCAGTTTTC 
TGATCCTAAA 
GAAAAATTAG 



31 

r 

CT6TTCTGGG 
GTGGACTTAG 
GTTGAAAAGC 
TTTGGGCTGA 
AGATGTGGAG 
CAAACACATC 
GACCATGCCA 
AAGGTCTCTG 
AACTCTCCTT 
ATTGGAGGGG 
AACTTACATC 
GATATCGGGG 
QATGACATTG 
GGCCCACAAA 
CGGGGAGAAG 
GAAGTTGAGC 
GCITACGAAT 
GTTCAGGGAC 
CCTAGAACTG 
TTCTTTGTTQ 
TATCCCAAAA 
ATGAAAGATG 
A06AAGAGAA 



41 
I 

0TGTG6TGTC 
TCCAGAAATA 
GGAGAAATAG 
AAGTGACTGG 
TGCCTGATGT 
TQACCTACAG 
TTGAGAAAGC 
AGGGTCAAGC 
TTGATGGACC 
ATGCTCATTT 
GTGTTGCGGC 
CTTTGATGTA 
ATGGCATCCA 
CCCCAAAAGC 
TGATGTTCTT 
TCAATTTCAT 
TTGCCGACAG 
AQAATGTGCT 
TGAAGCATAT 
CTAACAAATA 
TGATAGCACA 
GATTTTTCTA 
TTTTGACTCT 



51 

I 

ACACAGCTTC 
CCTGGAAAAA 
TGGCCCAGTG 
GAAACCAGAT 
G6CTCA0TTT 
GATTGAAAAT 
CTTCCAACTC 
AGACATCATG 
TGGAGGAAAT 
TGATGAAGAT 
TCATGCCCTC 
CCCTAGCTAC 
AGCCATATAT 
ATGTGACAGT 
TAAAGACAGA 
TTCTGTTTTC 
AGAT6AAGTC 
ACACGGATAC 
CGATGCTGCT 
CTGGA6GTAT 
TGACTTTCCT 
TTTCTTTCAT 
CCAGAAAGCT 



Seq ID NO: 393 Protein sequence
Protein Accession #: NP_002412.1



MHSPPPLLLL 
VEKLKQMQEF 
YTPDLPRADV 
IiAHAFQPGPG 
TFSGDVQLAQ 
FYMRTHPFYP 
PKDIYSSPOF 
GIGHKVDAVF 



11 
I 

LPWGWSHSF 
PGLKVTGKPD 
DHAIEKAFQL 
IGGDAHFDED 
DDIDGIQAIY 
EVELNFISVF 
PRTVKHIDAA 
HKDGPFYFFH 



21 
I 

PATLETQEQD 
AETLKVMKQP 
WSNVTPLTFT 
ERWTNNFREY 
GRSQHPVQPI 
HPQLPHGLBA 
LSEEHTGKTY 
GTRQYKFDPK 



31 

I 

VDLVQXYLEK 
ROGVPDVAQF 
KVSEGQADIH 
NLHRVAAHAL 
GPQTPKACDS 
AYEFADRDEV 
FFVANKYNRY 
TKRILTLQKA 



41 
I 

YVNLKNDGRQ 
VLTBGHPRWE 
ISFVRGDHRD 
(aiSLGLSHST 
KLTFDAITTI 
RFFK6NKYWA 
DEYKRSMDPG 






51 
1 

VEKRRNS6PV 60 

QTHLTYRIEN 120 

HSPFDGPGGH 180 

DIGALMYPSY 240 

RGEVMPFKDR 300 

VQGQHVLHGY 360 

YPKMIAHDFP 420 






Seq ID NO: 394 DNA sequence
Nucleic Acid Accession #: NM_014331.2
Coding sequence: 1..1506



1 11 21 31 41 51

I I I I I I

ATGGTCAGAA AGCCTGTTGT GTCCACCATC TCCAAAGGAG GTTACCrQCA GGGAAATGTT 60 

AAOGGXSftGGC TGCCTTCOCT GGGCAACAAG GAGCCACCTG GGCAGGAGAA AGTGCAQCTG 120 

AAGAOGftAAG TCACTTTACT GAGGGGAGTC TCCATTATCA TTG6CACCAT CATTGGAGCA 180 

GGAATCTTCA TCTCTCCTAA GGGCGTGCTC CAGAACACX5G GCAGCGTGGO CATGTCTCTG 240 

ACCATCTGGA CGGTGTGTGQ GGTCCTGTCA CTATTTGGAG CnTGTCTTA TGCTG AATTG 300 

GGAACAACTA TAAAGAAATC TGGA6GTCAT TACACATATA TTTTGGAAGT CTTTOGTCCA 360 

TTACCAGCTT TT6TACGAGT CTQGOTGQAA CTCCTCATAA TAOSCCCTGC AGCTACTGCT 420 

GTGATATCCC TG6CATTTGQ ACX3CTACATT CPGOAACCAT TTTTTATTCA ATGTGAAATC 480 

CCTGAACTTO CGATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGftT GGTCCTAAAT 540 

AGCATGAOTQ TCAGCT6GAG OGCCOGQATC CAGATTTTCT TAA0CTTTT6 CAAQCTCACA 600 

GCAATTCTQA TAATTATAGT CCCTGGAGTT ATGCAGCTAA TTAAAG6TCA AACQCAGAAC 660 

TTTAAAGACG OGTTTTCAGG AAGAGATTCA AGTATTAGGC GGTTGOCACT GGCTTTTTAT 720 

TATGOAATGT ATGCATATGC TGGCTGGTTT TACCTCAACT TTGTTACTGA AGAAGTAGAA 780 

AACCCTGAAA AAAOCATTCC CCTTGCAATA TGTATATCCA TG6CCATTGT CACC ATTGO C 840 

TATGTGCTGA CAAATGTGGC CTACTTTACG ACCATTAATG CTOAGOAGCT GCTSCTTTCA 900 

AATOCAGTGG CAGTGACCTT TTCTGAGCGO CTACT6GQAA ATTTCTCATT fiGCAGTTCCG 960 

ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAAOGGTG GTGTGTTTGC T6TCTCCAGG 1020 

TTATTCTATG TTGCGTCTCG AGAGGGTCAC CTTCCAGAAA TCCTCTCCAT GATTCATGTC 1080 

CGCAAGCACA CTCCTCTACC AGCTGTTATT GTTTTGCACC CTTTOACAAT Q ATAATG CTC 1140 

TTCTCTGGAO AOCTCGACAG TCTTTTOAAT TTCCTCAGTT TTGCCAGGTG GCTTTTTATT 1200 

GGGCTG6CAG TTGCTGGGCT GATTTATCTT 06ATACAAAT GCCCAGATAT GCATCGTCCT 1260 

TTCAAGGTX3C CACTGTTCAT CCCAGCTTTG TTTTCCTTCA CATGCCTCTT CATGGTTGCC 1320 

CTTTCCCTCT ATTOGGACCC ATTTAOTACA GOGATTGGCT TCGTCATCAC TCTGACTGGA 1380 

GTCCCTG06T ATTATCTCTT TATTATATQG GACAA6AAAC CCAGGTGGTT TAGAATAATG 1440 

TCAGAGAAAA TAACCA6AAC ATTACAAATA ATACTGGAAG TTGTACCA6A AGAA6ATAA6 1500 

TTATGAACTA ATGGACTTGA GATCTTGGCA ATCTGCCCAA GGGGAGACAC AAAATAGGGA 1560 

TTTTTACTTC ATTTTCTGAA AGTCTAGAGA ATTACAACTT TGGTGATAAA CAAAAGGAOT 1620 

CAGTTATTTT TATTCATATA TTTTAGCATA TTCGAACTAA TTrCTAAGAA ATTTAGTTAT 1680 

AACTCTATGT AGTTATAGAA ASTGAATATG CAGTTATTCT ATQAOTCGCA CAATTCTTGA 1740 

QTCTCTGATA CCTACCTATT GGGGTTA6GA GAAAAOACTA GACAATTACT ATGTGOTCAT 1800 

TCTCTACAAC ATATGTTAGC AOGGCAAAGA ACCTTCAAAT TGAAGAGTCSA QATTTTTCTO 1860 

TATATATGGG TTTTGTAAAG ATGGTTTXAC ACACTACAGA TQTCTATACT GTGAAAAGTG 1920 

TTTTCAATTC TGAAAAAAAG CATACATCAT QATTATGGCA AAGAGGAGAG AAAGAAATTT 1980 

ATTTTACATT GACATTGCAT TGCTTCCCCT TAGATACCAA TTTAGATAAC AAACACTCAT 2040 

GCTTTAATGG ATTATACOCA 6AGCACTTTQ AACAAAGGTC AGTGGG6ATT GTTGAAT ACA 2100 

TTAAAGAAGA GTrrCTAGGG GCTACTGTTT AT6AGACACA TCCAGGAGTT ATGTTTAA6T 2160 

AAAAATCCTT GAGAATTTAT TATGTCAGAT GTTTTTTCAT TCATTATCAO GAAGTTTTAG 2220 

TTATCTGTCA ' i ' lTni ' l ' TTi ' TCACATCAGT TTGATCAGGA AAGT6TATAA CACATCTTAG 2280 

A6CAAGAGTT AGTTTGGTAT TAAATCCTCA TTAGAACAAC CACCTGTTTC ACTAATAACT 2340 

TACCCXnX3AT GAGTCTATCT AAACATATGC ATTTTAAGCC TTCAAATTAC ATTATCAACA 2400 

TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAQ TCGCATATCT GTAATCATAT 2460 

CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTAAATTTA TGQCTATTTT TACACGATGA 2520

TGAATTTTGA CAGTTTGTGC ATTTTCTTTA TACATTTTAT ATTCTTCTGT TAAAATATCT 2580 

C3TCA6ATGA AACT6TCCAG ATTAATTAGG AAAAQGCATA TATTAACATA AAAATTGCAA 2640 

AAGAAATGTC GCIGTAAATA AGATTTACAA CTGATGTTTC TAGAAAATTT CCACTTCTAT 2700 

ATCTAG6CTT TGTOMSPAAT TTCCACAOCT TAATTATCAT TCAACTTGCA AAAORGACAA 2760 

CTGATAAGAA QAAAATTGAA ATQAQAATCT GTG6ATAAGT GTTTGTGTTC AGAAGATGTT 2820 

GTTTTGCCAG TATTAGAAAA TACTGTGAGC CXSGGCATGGT GGCTTACATC TGTAATCCCA 2880 

QCACTTTGGG AGGCTGAGGG GGTGGATCAC CTGAOGTCGQ GAGTTCTAGA CCAGCCTGAC 2940 

CAACATGGAO AAACCCCATC TCTACTAAAA ATACAAAATT A6CTGGGCAT GGTGGCACAT 3000 

GCTGSTAATC TCAGCTATXG ACSGAGGCTGA GGCAGGAGAA TIGCTTGAAC CCGGGAGGCG 3060 

GAGGTTGCAG TGAGCCAAGA TT6CACCACT GTACTCCAGC CTGGGT6ACA AAGTCAQACT 3120 
CCATCTCCAA AAAAAAAAAA AAAA 

Seq ID NO: 395 Protein sequence 
Protein Accession #: NP_055146.1

1 11 21 31 41 51 

I I I I I I 

HVRKPWSTI SKGGYLQGNV NGRLPSLGNK EPPGQBKVQL KRKVTLLRGV 8IIIGTII6A 60 

GIFISPK6VL QNTGSVGMSL TIWTVCGVLS LFGALSYAEL GTTIKKSGGH YTYILEVFQP 120 

LPAPVRVWVE IiLIIRPAATA VISIAFGRYl LBPPFIQCBI PBLAIKLITA VGITWMVtN 180 

SMSVSWSARI QIFLTFC3CLT AILIIIVPGV MQLIK6QTQN FKDAFSGRDS SITRIiPIAFY 240 

Y01YAYAGWP YLNPVTEEVE NPEKTIPLAI CISMAITIGV YVLTNVAYPT TIHAEELLLS 300 

NAVAVTFSER LLGNFSIAVP IFVALSCFGS MNGGVFAVSR LFYVASREGH liPEILSMIHV 360 

RKHTPLPAVI VLHPLTMIMIj FSGDLDSLm FLSPARWLFI GIiAVAGLIYL RYKCPDMHRP 420 

FKVPIiFIPAL FSFTCLPMVA LSLYSDPFST GIGFVITLTQ VPAYYLFIIW DKKPRWFRIH 480 
SEKITHTLQI ILBWPBEDK L 



Seq ID NO: 396 DNA sequence
Nucleic Acid Accession #: NM_006528
Coding sequence: 57..764

1 11 21 31 41 51

I I I I I I

GCCGCCAGCG GCTTTCTCGG AOGCCTTGCC CAGOG 6GO0G CCCGACCCCC TGCACCATG6 60 

ACCCGGCT06 OCCOCTGGGG CTGTOQATTC TOCTOCTTTT CCTGAOGGAO 6CTGCACIGQ 120 

QOQATGCTQC TCAG6A6GCA ACAGGAAATA AGGOGGAGAT CTOTCTCCTG CCCCTA6ACT 180 

AOGGACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTAOGA CAGGTACACG CAGAGCTGCC 240 

GCCAGTTCCT GTACQGGGGC TGCQAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT 300 

G0GAC6ATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAA6T TTGCCGGCTG CAAGTGAGT6 360 

TGGAOSACCA GIGTGAOOGG TOCACAGAAA AGTArTTCTT TAATCTAAGT TCCATGAGAT 420 

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCQ6AT TGAGAACAGG TTTCCAaATG 480 

AAGCTACTTG TATGGGCTTC TGCGCACCAA AGAAAATTCC ATCATTTTOC TACAGTCCAA 540 

AAGATGAGGG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAATCCA AGATACAGAA 600 

CCTGTGAT6C TTTCACCTAT ACCGGCTGTG GAGGGAATGA CAATAACTTT GTTAGCAGGG 660 
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AQGATTGCAA A0C3TGCATGT GCAAAAGCTT T6AAAAAGAA AAAGAAGATG CCAAA6CTTC 720 

GCrrPGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTOATTIAT TCACCAGTTT TTATTAATAC AAGTCACTTT 900 

TTCAAAAATT TOOATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTQA GTCTACCATT 960 

TTTAATTTAT GOTTCAACTQ TTTGrOAOAC GAATTCTTGC AATGCATAAG ATATAAAAOC 1020 

AAAtATGACT CACTCATTTC TTGGGGTOGT ATTCCT6ATT TCAGAAOAOG ATCATAACTG 1080 

AAACAACATA AOACAATATA ATCATOTGCT TTTAACATAT TTGAGAATAA AAAGGACEAG 1140 
CC 



Seq ID NO: 397 Protein sequence
Protein Accession #: NP_006519

1 11 21 31 41 51

I I I I I I

MDPARPLGLS IIiLIiFLTEAA L6DAAQBPT6 NNAEICLLPL DYGPCRALLL RyYYDRYTQS 60 

CRQFLYGGCE GNANNFYTWS ACDDACWRIE KVPKVCRLQV SVDDQCEGST BKYFPNLSSM 120 

TCEKPPSGGC HRNRIENRPP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYENPRY 180 
RTCDAFTYTG CXKSNDKNFVS REDCKRACAK ALKKKKKMPK IiRPASRIRKI RKKQF 

Seq ID NO: 398 DNA sequence

Nucleic Acid Accession #: NM_001508.1

Coding sequence: 1..1361

1 11 21 31 41 51

I I I I I I

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAQT CATGTC 60 

CCCGAGTTTG AGGTQGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120 

TTOGTGATGG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAGGT 6CTGCAGAAG 180 

AAAGGATACT T6CAGAAGGA GGTGACAGAC CACATGGTGA GTTTGGCTTG CTGG6ACATC 240 

TTGGTGTTCC TCATCQGCAT GCCCATGGAG TTCTACAOCA TCATCTGQAA TCCCCTGACC 300 

AOGTCCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTCGAGGC CTGCAfiCTAC 360 

GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGCGCT ACATC6GCAT CtGTCAGCCC 420 

TTCAGGTACA AfiGCTGTGTC GGGACCTTGC CAQGTGAAGC TGCTGATTGG CTTCGTCTGG 480 

GTCACCTCGG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA T6GGTACTGA GTACCCCCTG 540 

GTGAACGT6C CCAGCCACCG GGQTCTCACT TGCAACOGCT CCAGCACCCG CCACCACQAG 600 

CAGCCCGAGA CCTCCAATAT GTCCATCTGT ACCAACCTCT CCAGCCGCTG GACC6TGTTC 660 

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCOTGG TCCTGCTCTC GSTAGCCtTC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCT06CT GGC066GGGC 780 

ACGCGGCCTC OGCAGCTGAQ GAAGTCCGAG AGCGAAGA6A GCAGGACCGC CAGGAGGCAG 840 

ACCATCATCT TCCTGA6GCT GATTGTTGTG ACATTGOCCG TATGCTGQAT QCCCAACCAG 900 

ATTCGGAGGA TCATGQCTGC GGCCAAACCC AA6CA06ACT GGAC6AG6TC CTACTTCCGG 960 

GCGTACATGA TCCTCCTCCC CTTCTCG6A0 ACOTrPTTCT ACCTCAQCTC GGTCATCAAC 1020 

CCGCTCCTGT ACACGGTGTC CTOGCAGCAG TTTCGGCQGG TGTTCGTGCA QGTGCTGTOC 1080 

TQCCGCCTGT CGCTGCAGCA CGCCAACCAC GAQAAGCGCC TGCGCGTACA TGCGCACTCC 1140 

ACCACCGACA GCGCCCGCTT TGTGCAGCGC CCGTTGCTCT TOGCGTCCCG GCGCCAGTCC 1200 

TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260 

TCTAAOTCCC AgTCATTOAO TCTCSAOTCA CIAQAG CCCA ACTGAGGCGC GAAACCAQCC 1320 
AATTCTGCTG CAGASAATGG TTTTCA06A6 CATGAAGTTT GA 



Seq ID NO: 399 Protein sequence
Protein Accession #: NP_001499.1

1 11 21 31 41 51

I I I I I I

MASPSLPGSD CSQIIDHSHV PEFBVATWIK ITLILVYLII FVMGIiLGNSV TIBVTQVLQK 60 
KGYLQKEVTD HMVSLACSDI LVFLIGMPME FYSIIWNPLT TSSYTliSCKIj HTFLFEACSY 120 
ATLLHVIiTIiS PERYIAICHP FRYXAVS6FC QVKUilOPVW VTSAXiVAIiPIi LPAMGTEYPL 180 
VNVPSHRGIiT CNRSSTRHHE QPETSNHSIC TNLSSRHTVF QSSIF6AFW YLWLLSVAF 240 
MCWNMMQVLM KSQKGSLAQG TRPPQLRKSE SEESRTARRQ TIIFLRLIW TLAVCWMPMQ 300 
IRRIMAAAKP KHDHTRSYFR AYMILLPFSB TFPYLSSVIN PLLYTVSSQQ FRRVFVQVLC 360 
CRLSIiQHAHH EKRLRVHAHS TTDSARFVQR PLLFASRRQS SASRTBKIFL STFQSEAEPQ 420 
SKSQSLSLGS LEPN9GAKPA NSAAENGFQE HEV 



Seq ID NO: 400 DNA sequence 

Nucleic Acid Accession #: NM_006475.1

Coding sequence: 28..2538

1 11 21 31 41 51

I I I I I I

AACAGAACTG CAAC66AGAG ACTCAAGATG ATTCCCTTTT TAOCCATGTT TTCTCTACTA 60 

TTGCTGCTTA TTGTTAACCC TATAAACGCC AACAATCATT ATGACAAGAT CTTGGCTCAT 120 

AGTCGTATCA GGGGTCGGGA CCAAGGCCCA AATGTCTGTG CCCTTCAACA GATTTTGGGC 180 

ACCAAAAAGA AATACTTCAG CACTTGTAAG AACTGGTATA AAAAGTCCAT CTGTGGACAG 240 

ATWUICGACTG TTTTATATGA ATGTTGCCCT GGTTATATGA GAATGGAAGG AATGAAAGGC 300 

TGCCCAGCAG TTTTGCCCAT TGACCATGTT TATG6CACTC TGGGCATCOT GGOAGCCACC 360 

ACAAC6CAGC GCTATTCTGA CGCCTCAAAA CTGAGG6A66 A6ATCGAGGG AAA6GQATCC 420 

TTCACTTACT TTGCACCGAG TAATGAGGCT TGGGACAACT TGGATTCTGA TATCCGTAGA 480 

GGTTTGGAGA GCAACGTGAA TGTTGAATTA CTGAATGCTT TACATAQTCA CATGATTAAT 540 

AAGAQAATGT TGACCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 600 

TTGGGGCTTT TCATTAAGCA TTATCCTAAT GGGGTTOTCA CT6TTAATTG TGCTOQAATC 660 

ATCCATGGGA ACCAGATTGC AACAAATGOT GTTQTCCATG TCATTGACOS TGTGCTTACA 720 

CAAATTGGTA CCTCAATTCA AGACTTCATT GAAGCAGAAG AT6ACCTTTC ATCTTTTAGA 780 

GCAGCTGCCA TCACATCGGA CATATTGGAG GCCCTTGGAA GAGAOGGTCA CTTCACACTC 840 

TTTGCTCCCA CCAATGAGGC TTTTGAGAAA CTTCCAOGAG GTGTCCTAGA AAGGTTCATG 900 

GGAOACAAAG TGGCTTCOGA AGCTCTTATG AAGTACCACA TCTTAAATAC TCTCGAGTGT 960 

TCTGAOTCrA TTAT6GGAG0 AGCAGTCTTT GA6AC6CTG3 AAGQAAATAC AATTGAGATA 1020

GGATGTOACO GTGACAGTAT AACAGTAAAT GGAATCAAAA TQ6TGAACAA AAAGGATATT 1080

OTOACAAATA ATGGTGTGAT CCATTTGATT GATCAOOTCC TAATTCCTGA TTCTGCCAAA 1140 

CRAGTTATTG AGCrGGCTGO AAAACAGCAA ACCACCTTCA OGQATCTTGT GGCCCAATTA 1200 

GGCTTX3GCAT CTGCTCTGAG GCCAGATGGA GAATACACTT TOCTGGCACC TGTGAATAAT 1260 

GCATTTTCTG ATGATACTCT CRGCATGOTT CAGCQCCTCC TTAAATTAAT TCTGCAGAAT 1320

CACATATTGA AAGTAAAAGT TGGCCTTAAT GAGCTTTACA AOGGGOUVAT ACTGGAAACC 1380 

AT0G6AG6CA AACA6CTCA0 AQ TCT TOQ TA TATCGCACAG CTGTCTGCAT T6AAAATTCA 1440 

TGCATGGAGA AAGGGAGTAA 6CAAGG6AGA AAOGGTGCGA TTCACATATT CCX3CGAGATC 1500 

ATCAAGCCAO CAQA6AAATC CCTCCATGAA AAGTTAAAAC AAGATAAGOG CTTTAGCACC 1560

TTCCTCAGCC TACTTQAAQC TGCAGACTTG AAAGAGCTCC TGACACAACC TGGAOACTGa 1620 

ACATTATTTG T6CCAACCAA TGATGCTTTT AAG6GAATGA CTAGTGAAGA AAAAGAAATT 1680 

CTOATACGGO ACAAAAATGC TCTTCAAAAC ATCATTCTTT ATCACCTGAC ACCAGGAGTT 1740 

TTCATTGGAA AAGGATTTGA ACCTGGTGTT ACTAACATTT TAAAOACCAC ACAAGGAAQC 1800 

AAAATCTTTC TGAAAGAAGT AAATGATACA CTTCTGGTQA ATOAATTGAA ATCAAAAGAA 1860 

TCTOACATCA TGACAACAAA TGGTGTAATT CAT6TTGTAG ATAAACTCCT CTATCCAOGA 1920 

GACACACCTG TTGGAAATGA TCAACT6CTG QAAATACTTA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT TTGTTCGTGG TAGCACCTTC AAAGAAATCC C08TGACTGT CTATACAACT 2040 

AAAATTATAA CCAAAGTTGT GGAACCAAAA ATTAAAGTGA TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCAAAA CTQAAGGACC CACACTAACA AAAGTCAAAA TTQAAG6TGA ACCT6AATTC 2160 

AGACTGATTA AAGAAGGTGA AACAATAACT GAAGTQATCC ATG6AGAGGC AATTATTAAA 2220 

AAATACACCA AAATCATTGA TGGAGTGCCT GTGGAAATAA CTGAAAAAGA GACAOGAGAA 2280 

6AAC3GAATCA TTACAGGTCC TGAAATAAAA TACACTAGGA TTTCTACTGG AGGTGGA6AA 2340 

ACA6AAGAAA CTCTQAAGAA ATTGTTACAA GAAGAGGTCA CCAAGGTCAC CAAATTCATT 2400 

GAAGGT6GTG ATGGTCATTT ATTTGAA6AT GAAGAAATTA AAAGACTGCT TCAGGOAOAC 2460 

ACACCOGTOA GGAAGTTGCA AGCCAACAAA AAA6TTCAAG GTTCTAOAAG AOGATTAAGG 2520 

GAAGGTOGTT CTCAGTGAAA ATCCAAAAAC CA6AAAAAAA TGTTTATACA ACCCTAAGTC 2580 

AATAACCTGA CCTTAGAAAA TTGTGAGAGC CAAGTTGACT TCAGGAACTO AAACATCAGC 2640 

ACAAAGAAGC AATCATCAAA TAATTCTGAA CACAAATTTA ATATTTTTTT TTCTGAATGA 2700 

GAAACATGAG GGAAATTGTG GAGTTAOCCT CCTOTGOTAA AGGA ATTQA A GAAAATATAA 2760 

CACCTTACAC CCTTTTTCAT CTTGACATTA AAAOTTCIGG CTAACTTTGG AATCCATTAG 2820

AGAAAAATCC TTGTCACCAG ATTCATTACA ATTCAAATGG AAGACTTGTG AACTGTTATC 2880 

CCATTGAAAA GACCGAGCCT TOTATQTATG TTATGGATAC ATAAAATGCA CGCAAGCCAT 2940 

TATCTCTCCA TGGGAAGCTA AGTTATAAAA ATAGGTGCTT GGTGTACAAA ACT TTTTATA 3000 

TCAAAA6GCT TTGCACATTT CTATATGAGT GGGTTTACTG GTAAATTATG T TATTTTTT A 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT CATATGCTTC TTGCAATGCA TATTTTTTAA 3120 

TCTCAAAOST TTCAATAAAA CCATTTTTCA GATATAAAGA G AATTA CTTC AAATTGAOTA 3180
ATTCAGAAAA ACTCAA6ATT TAAGTTAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence 
Protein Accession #: NP_006466.1 

1 11 21 31 41 51 

I I I I I I

KIPFLPHFSL LLLLIVNPIM ANMHYDKILA HSRIRGRDQG PNVCALQQIL GTKXKYFSTC 60 

KHMYKKSICG QKTTVLYBCC PGYNSMEGMK GCPAVZiPZDH VYGTIAIVaA TTTQRYSDA8 120 

KLREBIBGXG SFTYFAPSNE AHDZOiDSDZR RGZiBSNVNVE LUIALHSHMI MKRMLTRDIiK 180 

NGMIIPSMYN NLGLPINHYP NGWTVNCAR IIHGNQIATN GWHVIDRVL TQIGTSIQDP 240 

lEAEDDLSSP RAAAITSDIL EALGRDGHFT IiFAPTNBAPE KLPRGVLERP MGDKVASEAL 300 

MKYHILNTLQ CSESIMGGAV FBTLEGIITIB IGCDGDSITV NGIKMVNKKD IVTMNGVIHL 360 

IDQVLIPDSA ROYIBLAOKQ QTTFTDLVAQ LGLASALRPD GEYTLLAPVN NAFSDDTLSM 420 

VQRIiIiKLILQ NHZLRVKVGL NELYMGQIXiE TIG6KQLRVF VYRTAVCIER SCMEKGSKQ6 480 

RKGAIHIFRB IIKPAEKSLH EKLKQDKRFS TPLSIiLEAAD UCBLhTQ^GD WTLPVPTNDA 540 

FKGMTSEBKB ILIRDKNALQ NIILYHLTPG VPIGKGPEPG VTNILKTTQG SKIFLKEVND 600 

TLLVNELKSK ESDIMTTNGV IHWDKIiLyP ADTPVGNDQL LEILMKLIKY IQIKFVRGST 660 

PKEIPVTWT TKIITKWBP KIKVIEOSLQ PIIKIEQPTL TKVKIBGEPE FRLIKBGETI 720 

TBVIBGBPZI KKYTKIXDGV PVBITEKETR EERXITGPEI lOrTRZSTGGG ETEETLKXLL 780 
QEEVTKVTKP lEGGDGHLPE DEEIKRLLQG DTPVRKLQAH KECVQGSSRHL REGR5Q 

Seq ID NO: 402 DNA sequence
Nucleic Acid Accession #: NM_002416
Coding sequence: 40..417

1 11 21 31 41 51 

I I I I I I 

ATCCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAA7«3 TGGTGTTCTT 60 

TTCCTCTTCG GCATCATCTT GCTGGTTCTG ATTQOAGTGC AAGGAACCCC AGTAGTGAGA 120 

AAGGGTCGCT GTTCCTGCAT CAGCACCAAC CAAGGGACTA TCCACCTACA ATCCTTGAAA 180 

GACCTTAAAC AATTTGCCCC AAGCCCTTCC T6CX3A6AAAA TTGAAATCAT TGCTACACTG 240 

AAQAATGGAG TTCAAACATG TCTAAACCCA GATTCAGCAG ATGTGAAGGA ACTGATTAAA 300 

AAGTGGGAGA AACAGGTCAG CCAAAAGAAA AAGCAAAAGA ATGGGAAAAA ACATCAAAAA 360 

AA6AAAGTTC TGAAAQTTCX3 AAAATCTCAA CGTTCTOGTC AAAAGAAGAC TACATAAGAQ 420 

ACCACTTCAC CAATAAQTAT TCTGTGTTAA AAATGTTCTA TTTTAATTAT A CCGCT ATCA 480 

TTCCAAAOGA GGATGGCATA TAATACAAAG GCTTA TTAAT TTQACTAQAA AATTTAAAAC 540 

ATTACTCTQA AATTGTAACT AAAGTTAGAA AGTTGATTTT AAGAATGCAA AGGTTAA6AA 600 

TTGTTAAAGG CTATGATTGT C ri'TG' r i'C n ' CTACXyVCCCA CCAQTTGAAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAGAT GTTCACCCAA CCACATCCCA 720 

CTCACAACAG CTGCCTlBGAA GAGCAGCCCT AGGCTTGCAC GTACTGCAGC CTCCAGAGA6 780 

TATCIGA6GC ACATGTCAGC AAGTCCTAAG CCTOTTAGCA TGCTGGTQAG CCAAGCAGTT 840 

TGAAATTGAO CTGGACCTCA CCAAGCTGCT GTOGCCATCA ACCTCTGrAT TTGAATCAGC 900 

CTACAGQCCT CACACACAAT GTGTCTGAGA GATTCATGCT GATTGTTATT GGGTATCACC 960 

ACTGGAGATC ACCAGTGTGT QGCTTTCAGA GCCTCCTTTC TGGCTTT6QA AGCCATGTGA 1020 

TTCX3VTCTTG CXXX3CTCAG0 CTGACCACTT TATTTCTTTT TGTTCCCCTT TGCTTCATTC 1080 

AAGTCA6CTC TTCTCCATCX: TAGCACAATG CAOTGCCTTT CTTCTCTCCA GTGCACCTGT 1140 

CATATGCTCT GATTTATCTG AGTCAACTOC TTTCTCATCT T6TCCCCAAC AG0CCACA6A 1200 

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCAGCT TAGTTCAA6T CCTGCCTCTT 1260 

AAATAAACCT TTTTGGACAC ACAAATTATC TTAAAACTCC TGTTTCACTT GQTTCAGTAC 1320 

CACATGGGIG AACACTCAAT GGTTAACTAA TTCTTGGGTG TTTATCCTAT CTCTCCAACC 1380
AQATTGTCA6 CTCCTT6AG6 6CAAGAGCCA CAGTATATTT CCCTGTTTCT TCCACAGTGC 1440
CTAATAATAC TQTQOAACTA G6TTTTAATA ATTTTTTAAT TQATGTTGTT ATGGOCAGGA 1500
TGGCAACCAG ACCATTGTCT CAGAGCA6GT GCTGGCTCTT TCCTGGCTAC TCCATOTTGG 1560
CTAGCCTCTG GTAACCTCTT ACTTATTATC TTCAG6ACAC TCACTACAGG GACCAGGGAT 1620
GATGCAACAT CCTTGTCTTT TTATGACAGG ATGTTTGCTC AGCTTCTCCA ACAATAAGAA 1680
GCAC6TGGTA AAACACTT6C GGATATTCTG GACTGTTTTT AAAAAATATA CAGTTTACCG 1740
AAAATCATAT AATCTTACAA TGAAAAG6AC TTTATAOATC AGCCAGTOAC CAACCTTTTC 1800
CCAACCATAC AAAAATTCCT TTTCCOGAAG GAAAAGGGCT TTCTCAATAA GCCTCAGCTT 1860
TCTAAGATCT AACAAGATAG CCACCGA6AT CCTTATCGAA ACTCATTTTA GGCAAATAT6 1920
AGTrTTATTG TCaSTTTACT TGTTTCAGAG TTTGTATTGT GATTATCAAT TACCACACCA 1980
TCTCCCATGA AGAAAGG6AA CGGTGAA6TA CTAAGCGCTA GA6GAAGCAG CCAAGTCGGT 2040
TAGTGGAAGC ATGATTGGTG CCCAGTTAGC CTCTGCAGGA TGTGGAAAOC TCCTTCCAGG 2100
GGAGGTTCAG TGAATTGTGT MaGAGAGGTT GTCT6T6GGC AGAATTTAAA CCTATACTCA 2160
CTTTCCX»AA TTGAATCACT GCTCACACTG CTGATGATTT AGAGT6CTGT CCGGTGGAGA 2220
TCCCACCOGA ACGTCTTATC TAATCATGAA ACTCCCTAGT TCCTTCATGT AACTTCCCTG 2280
AAAAATCTAA GTGTTTCATA AATTTGA6AG TCTGT6ACCC ACTTACCTTG CATCTCACAG 2340
GTAGACAGTA TATAACTAAC AACCAAAGAC TACATATTOT CACTGACACA CACOTTATAA 2400
TCATTTATCA TATATATACA TACATGCATA CACTCTCAAA GCAAATAATT TTTCACTTCA 2460
AAACAGTATT 6ACTTGTATA CCTTQTAATT TQAAATATTT TCTTTCTTAA AATAGAATGG 2520
TATCAATAAA TAGACCATTA ATCAG

Seq ID NO: 403 Protein sequence
Protein Accession #: NP_002407

1 11 21 31 41 51

I I I I I I

MKKSGVIiFLL GZILLVIiIGV QGTPWRKGR CSCISTHQGT IHLQSliRDLK QFAPSPSCBK 60
lEIIATLKNG VQTCLNPDSA DVKSLIXKHB KQV8QKKKQK NGXKKQKKKV UCVRKSQRSR 120
QKKTT

Seq ID NO: 404 DNA sequence

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347

1 11 21 31 41 51

I I I I I I

CCGGCTCGCG CCCTCCGGGC CCAGCCTCX:C QAGCCTTOSG AGCGGG06CC GTCCCAGCCC 60
AGCTCCGGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC C6CCGCCGGG 120
GA06GGCGTC TGCGGCTGGC GOGACTAGCG CTGGTACTCC 7GGGCTGGGT CTCCTOGTCT 180
TCTGCCAGCT CCTCX3GCATC CTCCTTCTCC TCCTGQGOQC CX3TTCCTG6C TTC0GCC6TG 240
TCOGCCCAGC OCCCQCTQCC GGACCAOTGC CCQ6CGGT6T 6G6AGTGCTC CGAGGCAGG6 300
CXKIACAGTCA AGTGCGTTAA CCX3CAATCTG ACC6AGGTGC CCACGGACCT GCCCGCCTAC 360
GTGCQCAACC TCTTCCTTAC CGGCAACCAG CTGGCCGTGC TCCCTGCCGG CGCCTTCGCC 420
CGCCGGCCGC GGCTGGGGGA GCTGGCCGCX3 CTCAACCTCA GOGGCAGCCG CCTGGACGAO 480
GTGCG06CX3G GCGCCTTCGA GCATCTGCCC AOCCTGGGCC AGCrCGACCr CAGCCACAAC 540
CX2ACTGGG00 ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA AT6CCAGCGT CTCGGCCCCC 600
AGTCCCCTTG TGGAACTGAT CCTGAACCAC AT06TGCCCC CTGAAGATGA GCGGCAGAAC 660
0GGAGCTTCX3 AGGGCATGGT GGTGGCGGCC CTGCTGGCGG GC06TGCACT GCAGGGGCTC 720
CGCC6CTTGG AGCTQGCCAQ CAACCACTTC CTTTACCTGC CGGGGGATGT GCTGGCCCAA 780
CTGCXICAGCC TCAGGCACCT 6GACTTAAGT AATAATT06C TGGIGAGCCT GACCTAG6TG 840
TCCTTCCGCA ACCTQACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900
CTTCACAATG GCAOCCTGGC TGAGTIGCAA GGTCTACCCC ACATTAGG6T TTTCCTGGAC 960
AACAATCCCT GGGTCT6CGA CTGCCACATG GCAGACATGG TGACCTGGCT CAA6GAAACA 1020
QAGGTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC C6GAAAAAAT GAGGAATCGG 1080
GTCCTCTTGG AACTCAACAG TCCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCT6 1140
CAAACCTCTT ATGTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGC6CTAT TTTCCTCCTG 1200
GTTTTGTATT TGAACCGCAA 6GGGATAAAA AAGTGGAT6C ATAACATCAG AGATGCXTTGC 1260
AGGGATCACA TGGAAGG6TA TCATTACAGA TATGAAATCA ATGCX3GACCX: CAGATTAACA 1320
AACCTCAGTT CTAACTCGQA TGTCTGAGAA ATATTA6AGG ACAGACCAAG GACAACTCTG 1380
CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 1440
TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500
TTTCTCGGT6 TGTTCTGTTA ATGTAAGACXS ATGAACA6TT GTGTATAGTG TTTTAOCCTC 1560
TTcrrrTTCT TGGAACTCCT CAACAGGTAT GGAGGGATTT TTCAGGTTTC A6CAT0AACA 1620
TGGGCTTCTT GCTGTCT6TC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 1680
ACAGATAGCA TTCAACAAAA GCTGCCrCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740
TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 1800
CTGCAGACGT TAGCAG6CTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCT6CATCCA 1860
AGAGCATQCT TACATTTTAC TOTTCTQCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920
TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCAGTTTAT AT6AAAATGT ACTGATTTTT 1980
TTTTAATAAA CTGCATCGAG ATCCAAC06A CTQAATTOTT AAAAAAAAAA AAAAATAAAG 2040
ATTCTTAAAA GAA

Seq ID NO: 405 Protein sequence
Protein Accession #: NP_006661

1 11 21 31 41 51

I I I I I I

MPGGCSRGPA AGDGRLRUVR LALVIiLGWVS SSSPTSSASS FSS8APFLAS AVSAQPPLPD 60
QCPALCBC8B AARTVKCVNR NLTEVPTDIiP AYVRNLFLTG NQIiAVLPAGA FARRPPLAEL 120
AAI(NLSGSRL DEVRAGAFEH LPSLRQLDLS HNFLADLSPF AFSGSHASVS APSPLVELIL 180
NKIVPPEDER ONRSFEGMW AALLAGRALQ GIiRRLELASN HFLYLPRDVL AQLPSLRHLD 240
IiSNNSltVSLT YVSFRNLTHL ESLHLEDNAL KVLHNGTLAS LQGLPHIRVF LDNNPWVCDC 300
EMADKVTHLK ETEWQGKDR LTCAYPBXNR NRVLLELNSA DLDCDPILPP SLQTSWFLG 360
IVUVLIGAIF LIiVLYLNRKG IKKWMHNIRD ACRDHHESGYB YRYBINADPR LTNLSSNSDV



Seq ID NO: 406 DNA sequence 
Nucleic Acid Accession #: B
Coding sequence: 1..927

1 11 21 31 41 51

I I I I I I

ATGCCTGGGG GGTGCTCCCG GGGCCCCGCC GCCGGGGACG GGCGTCT60G GCTGGGQC6A 60 

CTA6CQCTGG TACTCCTGGG CTGGGTCTCC TCGTCTTCTC CCACCTCCTC GGCATCCTCC 120 

TTCTCXrrCCT CQGG6C0GTT CCTGGCTTCC GCOGTGTCCG CCCAGOCCXX: GCTGCGGGAC 180 

CAOTOCCCCO OGCTGTGOGA GTGCTCCGAG GCAGCGOGCA CAGTCAAQTG CGTTAACCGC 240 

AATCTGACCG AGGTGCXXaC GGACCTGCCC GCCTAOQTGC GCAACCTCTT CCTTACCGGC 300 

AACCAGCTGG CCAGCAACCA CTTCCTTTAC CTGCCQOGGG ATGTGCTGGC CCAAC TGCC C 360 

AGCCTCAGGC ACCTGGACTT AAGTAATAAT TCGCTGGTGA GCCTGACCTA COTGTCCTTC 420 

CGCRACCTGA CACATCTASA AAGCCTCCAC CTGGAGGACA ATGCCCTCAA GGTCCTTCAC 480 

AATGGCACCC TGGCTGAGTT 6CAAGGTCTA CCCCACATTA QGGTTTTCCT GGACAACAAT 540 

CCCTGGGTCT OOGACTGCCA CATQGCAGAC ATGGTGACCT GGCTCAAGGA AACAGAGGTA 600 

GTGCAGGGCA AAGACCGGCT CACCTGTGCA TATCOGGAAA AAATGAGGAA TOGGGTCCTC 660 

TTGGAACTCA ACAGTGCTQA CCTGGACTGT GACCCGATTC TTCCCCCATC CCTGCAAACC 720 

TCTTATGTCT TCCTGGGTAT TGTTTTAGCC CTGATAGGCG CTATTTTCCT CCTGGTTTT6 780 

TATTTGAACC GCAAGGGGAT AAAAAAGTGG ATGCATAACA TCAGAGATGC CTGCAQGQAT 840 

CACATGGAA6 GGTATCATTA CA6ATATGAA ATCAATGCQ6 ACCCCAQATT AACAAACCTC 900 
AGTTCTAACT GG6ATGTCCT CGAGTGA 



Seq ID NO: 407 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MP6GCSR6PA AGD6RLRLAR LALVLLGWVS SSSPTSSAfiS FSBSAPFLAS AVSAQPPLPD 60 

QCPALCECSE AARTVKCVNR NLTBVPTDLP AYVRNLFLTG NQLASEmPLY LPRDVIiAQLP 120 

SIiHBIiDIiSIlN SIiVSLTYVSF RNLTHLESLH LEDNALKVia NGTIjABLQGL PHiaVFLDNN 180

PHVCDCHMAD MVTWLKETEV VQ6RDIILTCA YPBRMRHRVL LBLNSADLDC OPXLPPSLQT 240 

SYVFLGIVIiA LIGAIFLIiVli YUIRXGZKKN NHNIRDACHD HHECnrByKYE HIADPRLINL 300 

SSKSDVLE 



Seq ID NO: 408 DNA sequence 

Nucleic Acid Accession #: NM_000095.1

Coding sequence: 26..2299

1 11 21 31 41 51

I I I I I I

CAOCACCCAfi CTCCCC6CCA CC6GCATGGT C0C08ACACC GCCTGGGTTC TTCTGCTCAC 60 

CCTGGCTGCC CTCGGCJQCGT CCQGACAGGG CCAGAGCOGO TTGGGCTCAO ACCTGGGCOC 120 

GCAGATGCTT OGGGAACTGC AGGAAACC3UI CGCGGOGCTG CAGGAOGTGC GGGACTGGCT 180 

GOSQCAGCAG GTCAGGGAGA TCACGTTCCT GAAAAACACG GTGATGQAOT GTGACGCGTG 240 

OGGGATGCAG CAGTCAGTAC GCACCGGCCT ACCCAGCGTG CGGCCCCTGC TCCACT6CGC 300 

QOCOOGCTTC TGCTTCOCCO GG8TGGGCTG CATCCAOACG QAGAGOGGOO GCOGCTGCGG 360 

CCCCTGCCCC 0C0G6CTTCA CGGGCAAOGO CTOGCACTGC ACCQACOTCA AOGAOTGCAA 420 

CGCCCACCCC TGCTTCCCCC GAGTCC6CTG TATCAACACC AGCCCGGGGT TCOGCTGCGA 480 

GGCTTGCCCG COGGGGTACA GCGGCCCCAC CCACCAGGGC GTGGGGCTGG CTTTCGCCAA 540 

GGCCAACAAG CAGGTTTGCA CGGACATCAA CQAQTGTGAG ACCGGGCAAC ATAACTGCGT 600 

CCCCAACTCC GTGT6CATCA ACACCOQG6G CTCCTTCCAQ TGCGGCCOGT GCCAGCCCGQ 660 

CTTCGTGGGC GACCAGOOOT CCQ6CTGCCR GOGCGGCGCA CA6CGCTTCT GCCCCGACGQ 720 

CTC6CCCAGC GAGTGCCACG AGCATGCAGA CTGCGTCCTA GAGCGCGATG GCTGGCGGTC 780 

GTGCQTQTGT CGCGTTGGCT GGGCCGGCAA CGGGATCCTC TQTGGTCGOQ ACACTQACCT 840 

AGAOGGCTTC GOGGACGAGA AGCTGCGCTG CCCGGAGCCG CAGTGCCGTA AGGACAACTG 900 

CQIGACIGTG CCCAACTCAG GGCAGGAGGA TGTGGACOGC GATGGCATOG QAGACGCCTG 960 

GGATCCGGAT GC06ACGGGG AOGGGGTCCC CAATGAAAAG GACAACTGCC OGCTGGTGCG 1020 

GAACCCAGAC CAGC6CAACA CG6ACGAGGA CAAGTGGGGC GAT60GTQ0Q ACAACTGCCG 1080 

GTCCCAGAAG AACGACGACC AAAAGGACAC AGACCAGGAC GGCCGGGGOG AT606TGCGA 1140 

CGACGACATC GACGGCGACC GGATCC3GCAA CCAGGCCGAC AACTGCCCTA GGGTACCCAA 1200 

CTCAGACCAG AAGGACAGTG ATGGC3QATGG TATAGGGGAT GCCTGTGACA ACTGTCCCCA 1260 

GAA6AGCAAC COGGATCAGQ CG6ATQTGQA CCAC6ACTTT GTGGGAGATG CTTGTGACAG 1320 

CGATCAA6AC CAGGATGGA6 ACGGACATCA GGACTCTCGG GACAACTGTC CCACGGTGCC 1380 

TAACA6TGCC CAGGAGGACT CAOACCACXSA TGGCCAGGQT GATGCCTGC6 AGGACGACQA 1440 

CGACAATGAC G6AGTCCCT6 ACA6TCGGGA CAACTGCCGC CTGGT6CCTA ACCC06GCCA 1500 

GGAGGACGOQ GACAGGGACQ GCGTGGGCGA CGTGTGCCAG QAOGACTTTG ATGCAGACAA 1560 

GOTQQTAGAC AAGATOGACO TGTGTCCGGA GAACGCTGAA GTCACGCTCA CCQACTTCAG 1620 

GGCCTTCCAG ACAGTCGTGC T6GACC0GGA GGGT6A06C6 CAGATTGACC OCAACTGGQT 1680 

GGT6CTCAAC CAGG6AAGG6 AGATCGTGGA QACAAT6AAC AGCGACCCAO GCCTG6CTOT 1740 

GGGTTACACT GCCTTCAATG GCGTGGACTT C6AGG0CACQ TTCCATGTQA ACAGGOTCAC 1800 

GGATGAOGAC TATGCGGGCT TCATCTTTGG CTACCAQGAC AGCTCCAGCT TCTAC3GTGGT 1860 

CATGTGGAAG CAGATGGAGC AAACGTATTG GCAGGCGAAC CCCTTC06TG CTGTGGCCGA 1920 

GCCTGGCATC CAACTCAAGG CTGTQAAGTC TTCCACAG6C CCCGGGGAAC AGCTGCGGAA 1980 

C6CTCTGTG6 CATACAQQAG ACACAQAGTC CCAGGTG0Q6 CTGCTGTGGA AGGACCOGOG 2040 

AAACQTGGGT TGGAAGGACA AGAAGTCCTA TOGTTGGTTC CTGCAGCACC GGCCCCAAGT 2100 

GGGCTACATC AGGGTGCX3AT TCTATGAQGQ CCCTGAGCTG GTGGCCGACA GCAACGTGGT 2160 

CTTGGACACA ACCATGOGGG GTGGCCGCCT GGGGGTCTTC TGCTTCTCCC AGGAGAACAT 2220 

CATCTGGGCC AACCTGGGTT ACOGCTGCAA TGACACCATC CCAGAGGACT ATGAGACCCA 2280 

TCA6CT6CG6 CAAGCCTAGG GACCftOGOTG AGGACCCQOC GGATGACAGC CACCCTCACC 2340 

6CGGCTGQAT G6GGGCTCTG CACCCAGCCC AA6GGGTGGC 06T0CTGAG6 OGGAAOTGAG 2400 
AAGGGCTCAG AGAGGACAAA ATAAAGTGTG TGTGCAGGG 



Seq ID NO: 409 Protein sequence 
Protein Accession #: NP_000086.1

1 11 21 31 41 51

I I I I I I

MVPOTACVIiIi LTliAALGASG QGQSPLGSDL GPQNLRELQB TNAALQDVRO HIiRQQVRBIT 60
FLKKTVMECD ACX3KQQSVRT GLPSVRPLIjH CAPQFCFPGV ACZQTESGGR C6PCPAGFTG 120
NQSHCTDVNE CKAHPCFPRV RCIKTSPGFR CEACPPGYSO PTHQOVGLAF AKAHKQVCTD 180
INECETGQHN CVPNSVCINT RGSPQCGPCQ PGPVGDQASG CQSGAQRFCP [illegible] 240
ADCVLERDGS RSCVCRVGWA GNGILCGRDT DLDGFPDEKIi RCPEPQCRKD NCVTVPNSGQ 300
EDVDRDGIGD ACDPDADGDG VFNEKDNCPL VRHPDQR2)TD EDKWGDACDN CRSQmiDDQK 360
DTDQDGRQDA CDDDIDGDRI RNQADNCPRV PNSDQKDSDG DGIGQACDNC PQKSHPDQAD 420
VDHDPVGDAC DSDQDQDGDG HQDSRDNCPT VPNSAQEDSD HDGQ6DACDD OSDMDGVPDS 480
RDNCRLVPNP QQEDADRDGV GDVCQDDFDA DKWDKIDVC PENAEVTLTD FRAFQTWLD 540
PE^AQIDPN WWINQGREI VQTMNSDPGL AVGYTAFNGV DFE6TPHVNT VTDDDYAGPI 600
FGyQDSSSFY WKHKQMEQT YHQANPFSAV ABPGIQLKAV KSSTGPGEOL RNALWHTGDT 660
BSQVRLLHKD PBNVGWKDKK SYRHFLQHRP OVGYIRVRFY E6PELVAD5N WIiDTTMRGG 720
RLGVFCFSQE NIIWANLRYR CNDTIPEDYB THQLRQA

Seq ID NO: 410 DNA sequence

Nucleic Acid Accession #: NM_001565.1

Coding sequence: 67..363

1 11 21 31 41 51

I I I I I I

QAGACATTCC TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC CTCCAGTCrC 60
AGCACCATGA ATCAAACT6C GATTCTGATT TGCTGCCTTA TCTTTCTGAC TCTAAGTOGC 120
ATTCAAGGAG TACCTCTCTC TAGAACCGTA CGCTGTACCT GCATCAGCAT TAGTAATCAA 180
CCTGTTAATC CAAGGTCTTT AGAAAAACTT GAAATTATTC CTGCAAGCCA ATTTT6TCCA 240
CGTGTTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAGA A3AGATGTCT GAATCCAGAA 300
TOQAAGGCCA TCAAQAATTT ACT6AAA6CA GTTAGCAAGG AAAT6TCTAA AAGATCTOCT 360
TAAAACCAGA GGGGAGCAAA AT06ATGCAG TGCTTCCAAG GATGGACCM: ACAGAGGCT6 420
CCTCTCCCAT CACTTCXXTTA CATGGAGTAT ATGTCAAGCC ATAATTGTTC TTAGTTTGCA 480
GTTACACTAA AAGGTGACCA ATGATGGTCA CCAAATCAGC TGCTACTACT CCTGTAGGAA 540
GGTTAATQTT CATCATCCTA ASCTATTCAG TAATAACTCT ACCCTGGCAC TATAATGTAA 600
GCTCTACTQA GGTGCTATQT TCTTAGTGGtA TGTTCTGACC CTGCTTCAMV TATTTCCX!:TC 660
ACCTTTCCCA TCTTCCAAGG GTACTAAOGA ATCTTTCTGC mGGGGTTT ATCAGAATTC 720
TCAGAATCTC AAATAACTAA AAGGTATGCA ATCAAATCTG CTTTTTAAAG AATGCTCTTT 780
ACTTCATGQA CTTCCACTGC CATCX:TCCCA AGGGGCCCAA ATTCTTTCAG TGGCTACCTA 840
CATACAATTC CAAACACATA CAGGAAGGTA GAAATATCTG AAAATGTATG TGTAA6TATT 900
CTTATTTAAT GAAAGACTGT ACAAAGTATA AGTCTTAGAT GTATATATTT CCCATATTGT 960
TTTCAGTGTA CATGGAATAA CATGTAATTA AGTACTATGT ATCAATGAGT AACAGGAAAA 1020
TTTTAAAAAT ACAOATA6AT ATATGCTCTG CAT6TTACAT AAGATAAATG TGCTGAAI06 1080
TTTTCaiAATA AAAAT6AG6T ACTCTCCXQG AAATATTAA6

Seq ID NO: 411 Protein sequence
Protein Accession #: NP_001556.1

1 11 21 31 41 51

I I I I I I

NtlQTAILICC LIFLTLS6IQ GVPLSRTVRC TCISI8NQPV NPRSLBXLEI IPASQFCPRV
EZIATMKKRB EKRCUIPBSK AIXNZiLKAVS KENSKRSP

Seq ID NO: 412 DNA sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

1 11 21 31 41 51

I I I I I I

GQGAGGGAGA GA6G03G8CG GGTGAAAGGC GCATtGATGC AGCCTG0G6C 66CCT0GGA6 60
C6C6G06GAG CCAGACGCTG ACCACGTTCC [illegible] CTCCTCCGCC TCCAGCTCOQ 120
CGCTGCCC6G CAGCaSGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAiGOGGCr 180
CC6CGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTOGAGCQ CCTCTGAGAT 240
GCCCAAGG66 AAGCAAAA6G 06CAGCTC06 GCAGA666A6 6T6GTGGACC TGTATAAT6G 300
AAT6TGCTTA CAAGQGGCAG CAG6AGTGCC TGOTCGAQAC GOGAGOCCTG GGGCCAATSG 360
CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGA6AAA AG6GGGAAT6' 420
TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 480
ATTGAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTQT ACATTTACAA AGATGCGTTC 540
AAATAGTGCT CTAAGAQTTT TGTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCAT6 600
CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660
AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATOG 720
CACTTCTTCT GTGQAAGGAC TTTGTGAAGG AATTGGTGCT GGATTAGTGG ATGTTGCTAT 780
CTQGGTTGGC ACTTGTTCAG ATTACCCAAA AGGA6ATGCT TCTACTGGAT GGAATTCAGT 840
TTCTOGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 900
TTTTTTTATT ATGCCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 960
CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 1020
TTTAAATCIA GCATTATTCA TTTTGCTTCA ATCAAAAOTG GTTTCAATAT TTTTTTTAGT 1080
TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTG6A ATATTGTTGT 1140
GGTCTTTTGT TTTTTCTCTT AGTATA6CAT TTTTAAAAAA ATATAAAAGC TACCAATCTT 1200
TGTACAATTT GTAAATGTTA AQAATTTTTT TTATATCTQT TAAATAAAAA TTATTTCCAA 1260
CAACCTTAAA AAAAAAAAAA

Seq ID NO: 413 Protein sequence
Protein Accession #: XP_057014

1 11 21 31 41 51

I I I I I I

MRPQGPAA8P QRLRGLLLLL IiLQIiPAPSSA SEIPKGKQKA QLRQREWDL yNGMCLQGPA 60
GVPGRDGSPG ANGIPGTPGI FGRDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120
GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYPT FNGAECSGPL PIEAIIYLDQ 180
GSPEMNSTIN IHRTSSVBGL CEOIGAGLVD VAIHVGTCSD YPKGDASTGW NSVSRIIIEE 240
LPK

Seq ID NO: 414 DNA sequence
Nucleic Acid Accession #: XM_054007
Coding sequence: 138..3403
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1 
I 

CT0GTGC05A 
CCAGTGC3GCC 
QCGGAGACGA 
TCTCTGTCAC 
AAATTAiGTCC 
ATCATCTACa 
TCA6AAAATT 
ACCAOGACCA 
AOCATCACTC 
CTGCTTCTGQ 
6TAAAGATCC 
GAAGGAATGT 
TCTCTGAAGG 
CCAAAiQATQT 
TGGCTGGTAG 
6AAACACAAA 
GCATGGGCAT 
TCAACCAAAT 
CTGCAAA6AC 
TCA6TTTCCT 
AATTTCTCCT 
TACACCTTCT- 
CAATGGAAAT 
OTOCCTATTT 
TTCTTGTTGA 
AC3AAC3AAACC 
CTCAACTTTC 
QAGCAGACTC 
AAGAGGTCAT 
GGT6CAAGAA 
TTCACCACCA 
CTCACAGTCA 
TGGCCTGOAT 
GTGCTGCTTT 
ATGACTTGCC 
W3CAGGCTGT 
6AATTTTCAT 
GCTTTATTCAT 
QTGACCATG6 
GTTTTGGAAT 
TCTAGTTAAG 
AGGGAGATGA 
TTGTATTGAA 
TATTCTATCT 
TAAACAAGAG 
TTTTCAA6AA 
TQTTTAGaAA 
AGCAAAGAAA 
AAAAATCACA 
CAOAATTAGT 
GTA6TGAGCA 
AAATATATTT 
TT06TG0Q6G 
TATTGCCAAG 
CAAAATTATC 
TCATTTBATT 
GA6CAATTGT 
6ATGTTTCTT 



11 

1 

ATTCGGCACX3 
CGTGTGGAAC 
AG6CGCAATG 

AAATCcxxrrr 

GAATTG6GAA 
ACAGCTTTTC 
ACTTCAAAAT 
TCACTCAGAC 
A0ACCA06A6 
TAAAAATAAG 
TAQAAACAGC 
CAAGGACAGT 
AACTCACTTT 
AAGCAGCTCC 
GAAAAGAAA7 
TGAAAATCCT 
OCAtSGTTCCG 
T6ATGCTAGA 
CTATTCATTA 
GTCTCTGCTO 
GAGTTTCCTT 
TCCACATTCT 
OAAAAGAGGA 
TGATTCCAOG 
ACATQTCCTC 
TQAAAATGAT 
AACAAATGAG 
ACAAGA6CCC 
GATAGCTCAT 
TAAATGCC3^T 
TCATGACTAC 
CA6CCA6CX3C 
GGTGATAATG 
TACTGAAG6C 
TCATGAATTA 
CCTTTATAAT 
TGGTCATTAT 
GTATGTTGCT 
ATGTAGCOGC 
TATGTTACTT 
6TTTAAATGC 
GTTTGTATGC 
TATTGCTGTC 
T6GAGATAAA 
ATTTGGCATG 
CTAACACAGT 
TAAGAATOTG 
TAAAGGAGAA 
AAATTTGTTG 
ATA6AGTACA 
CTCTCATATA 
AATOAATTCA 
TTATATACCA 
TTATATATCA 
AGAGTAGTAA 
CGATTCAGAA 
CTTTATATAC 
TTTTACACAA 



21 
I 

AGACCGCGTG 
CAAACCTGCG 
GCQAGGAA6T 
CATGAACTAA 
TCTGGCATTA 
TACXX5CTATG 
ATAGGCATAG 
CACX3AGCATC 
CATCACICTO 
G8AAAAI5CTC 
CAGGGGAAAG 
GTTAGTGCTA 
CTAGAGACAA 
ACTOCACCCA 
GAATCTOTGA 
CAG6ASTGTT 
CTGAATGCAA 
TCTTGTCTQA 
C3VAATAGCCT 
OOQOTTATCT 
GTGGCACTGG 
CATGCAAGTC 
CCACTTTTCA 
TQGAAGGGTC 
ACATTGATCA 
GATGATGTGG 
GAGAA AGTA G 
TCXXACTTTG 
GCTCATCCAC 
TCACATTTCC 
CATCATATTC 
TACTCTOQGO 
GOTQATGSCC 
TTATCAAOTQ 
6GTGACTTTG 
0CATT6TCA6 
GCT6AAAATG 
CTGGTTGATA 
TGGGGGTATT 
ATTTCCATAT 
TAGAGTAGCT 
TOTACTATGC 
TGTTACAAAG 
ATCTGTATQT 
ACATQTTCTG 
TATTCCTATA 
GATQAAGCCT 
AAGASAAGAA 
TAAATTAGAG 
TTCATTAAAC 
CTAATTAGTG 
AGCAATATAC 
GATGAGTACA 
CCAAAAGCTG 
AACTTTGATA 
AGTACTTT6A 
GGTACTGTAG 
TAAATTCCTT 



31 
i 

TTCG06CCTG 
OGCGTGGCOQ 
TATCTGTAAT 
AAGCAGCTGC 
ATGTTSACTT 
GAGAAAATAA 
ATAAGATTAA 
ACTCAGACCA 
ACCATGATCA 
TTTGCCCAGA 
GAGCTCAC06 
GTGAAGTGAC 
TAGAGACTCC 
OTGTCACATC 
GTGAGCCCOO 
TCAATGCATC 
CAGAGTTCAA 
TTCATAC3UU3 
GGGTTGGTGG 
TAGTGCCTCT 
CXX3TTGGGAC 
ACCACCATAG 
GTCATCTGTC 
TAACAGCTCT 
AACAATTTAA 
AGATTAABAA 
ATACAGATGA 
ATTCTCAGCA 
AGGAAGTCTA 
AOGATACACT 
TCCATCATCA 
AGGAGCTGAA 
TQCACAATTT 
GTTTAAGTAC 
CTGTTCTACT 
CCAT6CTGGC 
TrrCTATGTG 
TGGTACCTGA 
TCTTTTTACA 
TTGAACATAA 
TAAAAAGTTG 
AGCGXTTAAA 
TCAGTTAAAG 
GCAATTCACC 
TATQTTTCAG 
CTGGATTTTA 
AAAATACCAA 
TCTGAGAATT 
GGGAGAAATT 
ATTTTTGTCA 
TACATTTAAC 
ACTTGACCAA 
GTQAGTA6TT 
TATGACTGGA 
TATATGAGQA 
TATCTCTCAS 
OCATACTAGG 
ATATCA6CTT 




AAGAATCCAT 
TGAGCX3TCAC 
TCACTCCCAC 
GCATGACrCA 
ACCAGAACAT 
CTCAACTGTG 
AAGACCTGQA 
AAAGAGCX^GG 
AAAAQGCTTT 
AAAGCTACTG 
CTATCTCTGT 
TGAAAAGAA6 
TTTTATAGCC 
CATGAATCX3G 
TTTQAGTGGT 
TCATAGCCAT 
TTCTCAAAAC 
AGGAGGCCTG 
AGATAAGAA6 
6CABTTGTCC 
TOQAACTOAA 
GCCTGCAarC 
CAATGAATAT 
OSGCCAGTCA 
CCACCACCAA 
AOATGOOGGC 
CA6CGATG6C 

AAAGGCTGGC 
GTATCTTGQA 
GATATTTGCA 
AATGCTGCAC 
GAATGCTGGG 
AATCGTGTTT 
TCATAGTTTC 
QTTACTGGGT 
GTAOGTrTXA 
GGTATTACCA 
GGAAAAATGT 
GGTCTCTGAA 
OAAAGCTTAT 
GGG6AGGCAT 
TA6AATTAA6 
GGATTATTTC 
TTTGTATAAT 
GAAATTGQAA 
TATGTATCAC 
TGTTCTGGTT 
TATTAAAACT 
TGCTTCAGTG 

ocrrarcTGTG 

G 



.51 
I 

CTCGAAGACA 
CAA0GAG6CC 
ACCTTTGCCC 
ACCACIGAGA 
ACAOGGCAAT 
GTTGAAGGGT 
ATACACCAT6 
TCAGACCATG 
CATAATCATG 
GATAGTTCAG 
GCCAGTGGTA 
TACAACACTG 
AAACTCTTCC 
GTGAGCCGGC 
AT6TATTCCA 
ACATCTCATG 
CCAGCX3VTCA 
GCTGAAATCC 
ATTTCCATCA 
GTGTTTTTCA 
GATGCTTTTT 
GAAGAACCAG 
ATAGAA6AAA 
TATTTCATGT 
AAAAAGAATC 
AA6TATGAAT 
GQCTATTTAC 
TTGOAAQAAS 
GTACCCAGAG 
GACGATCTCA 
AACCACCATC 
6TOGCCACTT 
CTAGCAATTG 
GTGTTCTGTC 

atgaccgtta 
atg6caacag 
cttactgctg 
aatgat6cta 
atgcttttgg 

CX3TATAAATT 
A8TAGGTCAT 
TTT6T6ATTT 
ATATTTAAGT 
GTTTATTATG 
CTTTAATGCT 
GAACTGCTGG 
ACIGAATTTA 
AGATTCTTAT 
TATAAAAAGG 
CCGTAAAAAC 
ACAGAAATCT 
7TTCAAAATG 
CAGACTGGQT 
ACCTG6TTTA 
ACACTAAGTA 
CTATCATTGT 
GCATTCTCTA

1 11 21 31 41 51

I I I I I I

KABKLSVILI LTPALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60
PYRYGENNSL SVB6FRKLLQ NIGIDKXKRI HIHHDKDHHS DHEHHSDHER HSDHEBH^H 120
QIESDHDKHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN &QGKGAHRPB HASGRRNVKD 180
SVSASBVTST VYNTVSBGTH PIiETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240
KESVSEPRKG PMYSRNTNEN PQSCFI7ASKL LTSHGMGIQV PLNATEFNYL CPAIINQIOA 300
R8CLIHTSHK KAEIPPKTYS LQIANVGGFI AISZZSFLSL LGVILVPLMN RVFFKPLLSF 360
LVALAVGTI.S GDAPLHLIiPH SHASHBRSBS HEEPAHEMKR GPLPSKLSSQ NIEESAYFDS 420
TWKGLTALGG LYFMFLVBHV LTLIKQFKDK KKKNOKKPEN DDDVEIKKQL SKYHSQLSTN 480
BEKVDTDDRT EXSYLRADSQB PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE yVPRGCKNKC 540
HSHFHDTLGQ SDDLIHHBHD YHHITrHHHHH QNHHPHSHSQ RYSRBBCiKDA GVATLAHMVI 600
K6DGLHNFSD GLAIGAAFTB GLSSGLSTSV AVFCHELPHB LODFAVIjLKA GMTVKQAVLY 660
NALSAMLAYL GMATGIFIC^ YAaiVSMHZF ALTAGLPMYV ALVDMVPSOi HNDASDHGCS 720
RWGYPFLQHA GMLLGFGIMli LISIFEHKIV FRINF

1 11 21 31 41 51

| | | | | |

ATGCCGAAGC GCGOGCACTG GGGGGC^TC. TCOGTGGTGC TGATCCTGCT TTGGGGCCAT 60

CG6G6AGTG6 CGCT66CCTG CCCX3CATCCT TGTGCCTQCT AOQTCCCCAG CX3AGGTCCAC 120

TGCAC6TTCC GATCCCTGGC TTC03TGCCC GCTGGCATTG CTAGACACX3T 6GAAAGAATC 180

AATTTGGGGT TTAATAGCAT ACAGGCCCT6 TCAGAAACCr CATTTGCAG6 ACTGACCAAG 240

TTGGA6CTAC TIATGATTCA CGGCAATGAG ATCCCAAGCA TCXIKXSSATGQ AGCrCTAAGA 300

GACCTCAGCT CTCTTCAGGT TTTCAAGTTC AGCTACAACA AGCTGAGAGT GATCACAGGA 360

CAGACOCTCC AGGGTCTCTC TAACTTAATG AGGCTGCACA TTGACCACAA CAAGATCGAG 420

TTTATCCACC CTCAAGCTTT CAAOGGCTTA AOGTCTCTGA GGGTACTCCA TTTGGAAGGA 480

AATCTCCTCC ACCAGCTGCA CCCCAGCACC TTCTCCACQT TCACATTTTT QQATTATTTC 540

AGACTCTCCA CCATAAGGCA CCTCTACTTA GCAGAGAACA TGGTTAGAAC TCTTOCTGCC 600

AGCATOCTTC GGAACATGCC GCTTCTGGAG AATCTTTACT TGCAGGGAAA TCCGTGGACC 660

T6C6ATTGTG AGATGA6ATG GTTTTTGGAA TGGGATQCAA AATCCAGAGG AATTCTGAAG 720

TGTAAAAAGG ACAAAGCTTA TGAAGGCX3GT CAGTTGTGTG CAATGTGCTT CAGTCCAAA6 780

AAGTTGTACA AACATGAGAT ACACAAGCTG AAGOACATGA CTTGTCTGAA GCCTTCAATA 840

GAGTCCCCTC TGAGACAGAA CAGGAGCAG6 A6TATT6A6G AGGAGCAAGA ACA6GAA6AG 900

GATGGTGGCA GCCAGCTCAT CCTGGAGAAA TTCCAACTGC CCCAGTGGAG CATCTCTTTG 960

AATATGACCXS AGGAGCAGGG GAACATGGTG AACTTGGTCT GTGACATCAA GAAACCAATQ 1020

GATGTGTACA A6ATTCACTT GAACCAAAOG GATCCTCCAG ATATTQACAT AAATGCAACA 1080

GTTGCCTTGG ACTTTQAGTG TCCAAT6ACC CX3AGAAAACT ATGAAAAGCT ATG6AAATTG 1140

ATAGCATACT ACAQTGAAGT TCCCGTGAAG CTACACA6AQ A6CTCAT6CT CAGCAAAGAC 1200

CCCAGAGTCA GCTACCAGTA CAGGCAGGAT GCTGATGAGG AAGCTCTTTA CTACACAGGT 1260

GTGA6AGCCC AGATTCTTGC ACAACCAGAA TGGGTCATGC AGCCATCX3VT AGATATCCAG 1320

CTQAACCGAC GTCAGASTAC G6CGAAGAAG GTGCTACTTT CCTACTACAC CCAGTATTCT 1380

CAAACAATAT CCACCAAAGA TACAAGGCAG GCTCGGGGCA GAAGCTGGGT AATGATTGAG 1440

CCTA6TG6AG CTGTGCAAAG AGATCAGACt 6TCCTGGAAG GGGGTCCATG OCAGTTGAGC 1500

TGCAACGTGA AAGCTTCTQA GAGTCCATCT ATCTTCTGGG TGCTTCCAGA TGGCTCCATC 1560

CTGAAAGOGC CCATGGATGA CCCAGACAGC AAGTTCrCCA TTCTCA6CAG TGGCTGGCT6 1620

AG6ATCAAGT CCATGGAGCC ATCTGACTCA GGCTTGTACC AGTGCATTGC TCAAGTGAGG 1680

QATGAAATGG ACOGCATGGrr ATATA6GGXA CTTQTQCAGT CTCCCTCCAC TCAGCCAGCC 1740

GAGAAAGACA CAGTGACAAT T6GCAAGAAC CCAGG6GAGT OGGTGACATT GCCTTGCAAT 1800

GCTTTAGCAA TACCOGAAGC CCACCTTAGC TGOATTCTTC CAAACA6AAG GATAATTAAT 1860

GATTTGGCTA ACACATCACA TGTATACATG TT6CCAAATG GAACTCTTTC CATCCCAAAG 1920

6TCCAAQTCA GTGATA6TGG TTACTACAGA TGTGTGGCTG TCAACCAGCA AGGGGCA6AC 1980

CATTTTACGG TGGGAATCAC AGTQACCAAG AAAGGGTCTG GCTTGCCATC CAAAAGAGGC 2040

AGACXSCCCAG GT6CAAA6GC TCTTTCCAGA GTCAGAGAAG ACATCGTGGA GGATGAAGG6 2100

GGCTCGGGCA TGGGAGATGA A6AGAACACT TCAAGGAGAC TTCTGCATCC AAAGGACCAA 2160

GAGGTGTTCC TCAAAACAAA GGATGATGCC ATCAATGGAG ACAAOAAAGC CAAGAAA6GG 2220

AGAA6AAAGC TGAAACTCTG GAAGCATTCG GAAAAAGAAC CA6AGACCAA TGTTGCAGAA 2280

GOTCXKAQAa T6T1TGAATC TAGACGAAGG ATAAACATGG CAAACAAACA GATTAATCCG 2340

GA60QCTGG6 CTGATATTTT AGCCAAAGTC CGTGGGAAAA ATCTCCCTAA GGGCACAGAA 2400

GTACCCCOGT TGATTAAAAC CACAAGTCCr CCATCCTTGA GCCTAGAAGT CACACCACCT 2460

TTTCCTGCTG TTTCTCCCCC CTCAGCATCT CCTGTGCAGA CAGTAACCAG TGCIGAA6AA 2520

TGCTCASCA6 ATGTACCTCT ACTTGGTGAA GAAOAGCAOQ TTTTGGGTAC CATTTCCTCA 2580

GGCAGQVTGG GGCCAGAACA CAACCACAAT GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640

AGCACAGCTC TGGA6GAAGT TGTTQATGAC CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700

ACTGAAGGAG ACCTGAAGGG GACAGCAGCC CCTACACTTA TATCTGAGCC TTATGAACCA 2760

TCTCCTACTC TGCACACATT AGACACAGTC TATtSAAAAGC OCAOCCATGA AGAGAC6GCA 2820

ACAGAGGGTT GGTCTGCAGC AGATGTTGGA TCGTCACCAG AGCCCACATC CAGTGAGTAT 2880

GAGCCTCCAT TGGATGCTGT CTCCTTGGCT GAGTCTGAGC CCATGCAATA CTTTGACCCA 2940

QATTTGGAGA CTAAGTCACA ACCAGATGAG GATAAGATGA AAGAAGACAC CTTTGCACAC 3000

CTTACTCCAA CCGCCACCAT CTGGGTTAAT GACTCCAGTA CATCACAGTT ATTTGAGGAT 3060

TCTACTATAG GGGAACCAOG TGTCCCA6GC CAATCACATC TACAAGGACT GACAGACAAC 3120

ATCCACCTTG TGAAAAGTAG TCTAAGCACT CAAGACACCT TACTGATTAA AAAGGGTATG 3180

AAAGAGATGT CTCAGACACT ACAGGGAGGA AATATGCTAG AGGGAGACCC CACACACTOC 3240

AGAAGTTCTG AGAGTGAGGG CCAAGAGAGC AAATCCATCA CTTTGCCTGA CTCCACACTG 3300

GGTATAATGA GCA6TATGTC TCCAGTTAAG AA6CCTGCGG AAACCACAGT TGGTACCCTC 3360

CTAGACAAAG ACAOCACAAC ABTAACAACA ACACCAA66C AAAAA6TTGC TCCGTCATCC 3420

ACCATGAGCA CTCACCCTTC TCGAAG6AGA CCCS^ACGGSA GAAGGAGATT ACX3CCCCAAC 3480

AAATTCCS3CC ACCGGCACAA GCAAACCCCA CCCACAACTT TTGCCCCATC AGAGACTTTT 3540

TCTACTCAAC CAACTCAAGC ACCTGACATT AAGATTTCAA GTCAAGTGGA GAGTTCTCTG 3600

6TTCCTACAG CTTQGGTGGA TAACAC3M3TT AATACCCCCA AACAGTTGGA AATGGAGAA6 3660

AATQCAOAAG CGACATCCAA GGGAACACCA CGGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720

CATOGATATA CCCCTTCTAC AGTGAGCTCA AQAGOGTCOG GATCCAAGCC CAGCCCTTCT 3780

CCAGAAAATA AACATA6AAA CATTGTTACT CCCAGTTCAG AAACTATACr TTTGCCTAGA 3840

ACTGTTTCTC TGAAAACTGA 6GGCCCTTAT QATTCCTTAG ATTACATGAC AAGCAGCA6A 3900

AAAATATATT CATCTTACCC TAAA6TCCAA 6AGACACTTC CAGTCACaVTA TAAACCCACA 3960

TCAgftTGGAA AAGAAATTAA GGATOATGTT GCCACAAATG TTGACAAACA TAAAAGTGAC 4020

ATTTTAGTCA CTGGTGAATC AATTACTAAT GCCATACCAA CTTCTOGCTC CTTGGTCTCC 4080

ACTATGGGAG AATTTAAG6A AGAATCCTCT CCTGTAGGCT TTCCAGQAAC TCCAACCTGG 4140

AATCCCTCAA GGACX^GCCCA GCCTGGGAGG CTACAGACAO ACATAOCTGT TACCACTTCr 4200

GGGGAAAATC TTACAGACCC TCCCCTTCTT AAAGAGCTTG AGGATOTGGA TTTCACTTCC 4260

GRGTTTTTGT CCTCTTTGAC AGTCTCCACA CCATTTCACC AGGAA6AAGC TGGTTCTTCC 4320

AC3UUJTCrCT CAAGCATAAA AGTGQA6GTG GCTTCAAGTC AGGCAGAAAC CACCACCCTT 4380

GATCAAGATC ATCTTGAAAC CACTGTGGCT ATTCTCCTTT CTGAAACTAG ACCACAQAAT 4440

CACACCCCTA CTGCTGCCCG GATGAAGGAO CCA6CATCCT 08TOCCCATC CACAATTCTC 4500

ATGTCTTTGG GACAAACCAC CACCACTAAG CCAGCACTTC CCAGTCX3UM3 AATATCTCAA 4560

GCATCTAGAG ATTCCAAGGA AAATGTTTTC TTGAATTATG TGGGGAATCC AGAAACAGAA 4620

GCAACCCCAG TCAACAATGA AGGAACACAG CATATGTCAG GGCCAAATGA ATTATCAACA 4680

CCCTCTTCCG ACGGGGATGC ATTTAACTTG TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740

TTTGQTAGTA GGAGTCTACC AOGTGGCTCA GATAGCCAAC GCCAGGATG6 AAGA6TTCAT 4800

GCTTCrCATC AACTAACCA6 AGTCCCTGCC AAACCCATCC TACCAACAGC AACAGTGAGG 4860

CTACCTGAAA TGTCCACACA AAGCGCTTCC AGATACTTTG TAACTTCCCa GTCACCTCGT 4920

CACTGGACCA ACAAACCGGA AATAACTACA TATCCTTCTG G6GCTTTGCC AGAGAACAAA 4980

CAGTTTACAA CTCCAAGATT ATCAAGTACA ACAATTCCTC TCCCATTGCA CATGTCCAAA 5040




CCCAOCATTC CTAGTAAGTT TACTGACOGA AQAACTGACC AATTCAATGG TTACTCCAAA 5100

GTQTTaxaQAA ATAACAACAT GCCTGAGGCA AGAAACCCAG TTGGAAAGCC TCCCAGTCCA 5160 

AOAATTCCTC ATTATTCCAA TGGAAGACTC CCTTTCTTTA CCAACAAGAC TCTTTCrTTT 5220 

CCACAGTTGG QAC5TCACC0G GAGACX:CCAG ATACCCACTT CTCCTGCCCC AGTAATGAGA 5280 

GAGAGAAAAG TTATTCCAGG TTCCTACAAC AGGATACATT CCCATAGCAC CTTCCATCTG 5340 

GACTTTGGCC CTCCX3GCACC TCCGTTGTTG CACACTCCGC AGACXy^OGGO ATCaCCCTCA 5400 

ACTAACTTAC A6AATATCCC TATGGTCTCT TCCACCCAGA GTTCTATCTC CTTTATAACA 5460 

TCTTCTGTCC AGTCCTCAGG AAGCTTCCAC CAGAQCAGCT CAAAGTTCTT TGCAGGAGGA 5520 

CCTCCTGCAT CCAAATTCTG GTCTCTTGGG GAAAAGCCCC AAATCCTCAC CAAGTCCCCA 5580 

CAGACTGTGT CXXH'CACCGC TGAGACAGAC ACT6TGTT0C CCTOTGAGQC AACAGGAAAA 5640 

CCAAAGCCTT TCXHTACTTG GACAAAGGTT TCCACAGGAG CTCTTATQAC TC06AATACC 5700 

AGGATACAAC GGTTTQAQOT TCTCAAGAAC GQTACCTTAG TGATAOGGAA GGTTCAAGTA 5760 

CAAGATCX3AG GCX:7U3TATAT GTGCACCGCC AGCAACC7GC ACGGCCTGGA CAGGATGGTG 5820 

GTCTTGCTTT G66TCAC0GT GCAGCAACCT CAAATCCTAG CCTCCCACTA CXAGGACGTC 5880 

ACTGTCTACC TGGQAGACAC CATTGCAAT6 GAOTQTCTGG CCAAAGG6AC CCCAGCOCOC 5940

CAAATTTCCT G6ATCTTCCC TGACAGGAG6 6TGTGGCAAA CTGTGTCOCC OGTGGAGAGC 6000 

CQCATCACCC TQCAOQAAAA COGGACCCTT TCCATCAAGG AGGCGTCCTT CTCAGACAGA 6060 

GGC6TCTATA AGTGCXaTGGC CAGCAATGCA GCCGGQGCGG ACAGCCTGGC CATCCGCCTG 6120 

CACGT6G0GG CACTGCCCCC GGTTATCGAC CAGGAGAAGC TG6AGAACAT CTCGCT6CCC 6180 

GOGGOGCrCA 6CATTCACAT TCACIGCACT GOCAAOGCTG CGCOOCTGCC CAGOGTGCGC 6240 

TOQGTGCTOG G6GA0GGTAC CCAGATCOGC CCCTCX3CAGT TCCTCCAGGG QAACTTGTTT 6300 

GTTTTCCCCA ACGGGACX3CT CTACATCCGC AACCTCGCGC CCAAGGACAG 0GGGC6CTAT 6360 

GAGTGGGTGG CGGCCAACCT GGTAGGCTCC GCGCGCAGGA C6GTGCAGCT GAAOGTGCAG 6420

C6TGCAGCAG CCAAGGCGOG CATCACGGGC ACCTCCCCX3C GGAGGAOSGA C6TCAGGTAC 6480 

GGA66AACCC TCAAGCTGGA CTGCAGOGCC TCXS36GGAGC CCTGGGCQOO CATCCTCIG6 6540 

AGGCTGCOGT CCAA6AGGAT GATCSGACGOS CTCTTCA6TT TTGATAGCA6 AATCAAG6TG 6600 

TTTGCCAATG GQACCXrTGGT OGTQAAATCA GTGACGGACA AAGATGCCGG AGATTACXTTG 6660 

TGCGTAGCTC GAAATAAGGT TGQTGATGAC TACXSTGGTGC TCAAAGTGGA TGTGGTQATG 6720 

AAACX:GGCCA AGATTQAACA CAAGGAGQAG AACX3ACCACA AAGTCTTCTA CGGGGGT6AC 6780 

CTSAAAGTOO ACTQTGTGGC CACGQGGCTT CCOUVTOCGO AGATCTOCTG GAGCCTCCCA 6840 

GAOGGGAGTC TGGTGAACTC CTTCATGCA6 TOGGATYSICA GOGGTGGACG CACCAA606C 6900 

TATGTGGTCT TCAACAATGG GACACTCTAC TTTAA06AAG TGGGGAT6A6 G6AGGAAGGA 6960 

GACTACACCT GCTTTGCTGA AAATCAC3GTC GGGAAGGAOG AGATGAGAGT CAGAGTCAAG 7020 

GT6GTGAGAG OGCCCGCCAC CATCC36GAAC AAGACTTACT TGGOGGTTCA GGTGOCCTAT 7080 

GGAGAOGTGG TCACT6TAGC CXGTQAGGCC AAAGGAGAAC CCATGOCCAA GGT6ACTTGO 7140 

TTGTCCCm CCAACAAGGT 6ATCCCCACC TCCTCTGAGA AOTATCAGAT ATACCAAGAT 7200 

GGCACTCTCC TTATTCAGAA AGCCCAGCGT TCTSACAGCX5 GCAACTACaC CTGCCTQGTC 7260 

AGGAACAGCG CGGGAGAGGA TAGGAAQACG GTOTQGATTC AOOTCAACGT CCAGCCACCC 7320 

AAGATCAACG GTAACCCCAA CCCCATCACC AC0GTGC6GG AGATAGCAGC OGGGGGCAGT 7380

CG6AAACTGA TTQACTGGAA AGCTGAA06C ATCCCCACCC OGAGOGTGTT ATGGGCTTTT 7440 

CCCX3AGGGT6 TGGTTCTGCC A6CTGCATAG TATG6AAACC GGATCACTGT CCAT06CAAC 7500 

G6TTCCCTGG ACATCAGGAO TTTQAGGAAG AGOGACTCOS TCCA6CT6GT ATGCATGGCA 7560 

CGCAAC6AGG GAGGGGAGGC 6AGGTTGATC GTGCAGCTCA CTGTCCTGGA GCCCATGGAG 7620 

AAACCCATCT TCCAOSACXX: GATCAGCGA6 AAGATCA006 CCATG60GGG CCACACCATC 7680 

AQCCTCAACT GCICTGCOGC GGGGA0C008 ACACXXAGOC TG0TGTQ60T CCTTCCCAAT 7740

G6CAC06ATC TGCAGAGTGG ACAGCAGCTG CAGG6CTTCT ACCACAA60C TGACGaCATO 7800 

CTACACATTA GOGGTCTCTC CTCGGT6GAC GCTOGGQCCT ACJCGCTGOGT GGCCC3GCAAT 7860 

GCOGCTGGCC ACACGGA6AG GCTGGTCTCX CTGAAGGTGG GACTGAAGCC AGAAGCAAAC 7920 

AAGCAGTATC ATAACCTGGT CAGCATCATC AATOGTGAGA CCCTGAAGCT CCCCTGCACC 7980 

CCTCCCX3GGG CTGGGCAGG6 AOGTTTCTCC TGGACGCTCC CCAATGGCAT GCATCTGGAG 8040 

G6GCCCCAAA CCCTGGGAGO Otfr iT U l ' LTl ' CTGGACAATG GCACCCTCAC GGTTOGTGAG 8100 

GCCTCGGTGT TTGACAGGGG TACCTATGTA TGCAGGATG6 A6A0GGAGTA CGGCCCTTCQ 8160 

GTCACCA6CA TCCCOGTQAT TGTGATCGCC TATCCTCCCC GGATCACCAG CXSAGCXXACC 8220 

COGGTCATCT ACACCOSGCC OGGGAACACC GTGAAACTGA ACTGCATGGC TATGGGGATT 8280

C0CAAAGCT6 ACATCAGQTO aOAOTTAGOG GATAAGTCGC ATCTGAAGGC AG(^?GTTCAG 8340 

OCTOGTCTGT ATGGAAACAO ATTTCTTCAC CCCCAGGGAT CACTGACCAT CCAGCATGCC 8400 

ACACAQAQAQ ATGCOGGCTT CTACAAGTGC ATGGCAAAAA ACATTCTCGG CAGTGACTCC 8460 

AAAACAACTT ACATCCACGT CTTTCTGAAAT GTG6ATTCCA GAATGATTGC TTAGGAACTG 8520 

ACAACAAAGC GGGGTTTGTA AGGGAAGCCA GGTT06GGAA TAGGAGCTCT TAAATAATGT 8580 

GTCACAGTGC ATGGTGGCCT CTGGTGGGTT TCAAGTTGAG GTTGATCTTG ATCTACAATT 8640 

GTTGGGAAAA GGAAGCAATG CAGACACGAG AAGGAGGGCT CAGCCTTGCT GAGACACTTT 8700

CTTTTGTGTT TACATCAT6C CAGGGGCTTC ATTCAGGGTG TCTaTGCTCT GACTGCAATT 8760 

TlT C r r C r r T TQCAAATQCC ACTOQACTGC CTTCATAAQC GTCCATAGGA TATCTGAGGA 8820

ACATTCATCA AAAATAAGCC ATAGACATGA ACAACACCTC ACTACCCCAT T6AAGAC3QCA 8880 

TCACCTAQTT AACCTGCTGC AGTTTTTACA TGATAGACTT TGTTCCAGAT TGACAAGTCA 8940 

TCTTTCAGTT ATTTCCTCTG TCACTTCAAA ACTCCAGCTT GCCCAATAAG GATTTAGAAC 9000 

CAGAGTGACT GATATATATA TATATATTTT AATTCAGAGT TACATACATA CAGCTACCAT 9060

TTTATATGAA AAAAGAAAAA CATTTCTTCC TQGAACTCAC TTTTTATATA ATGTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCAGACGAT GAGACTAGAA GGAGAAATAC TTTCTGTCTT 9180 

ATTAAAATTA ATAAATTATT GGTCTTTACA AGACTTGGAT ACATTACAGC AQACATGGAA 9240 

ATATAATTTT AAAAAATTTC TCTCCAACCT CCTTCAAATT CAGTCACCAC TGTTATATTA 9300

CCTTCTCCAG 6AACCCTCCA GTGGGGAA66 CTGOQATATT AOATTTCCTT GTATGCAAAG 9360 

TTTTTGTTGA AAGCT6T6CT CA6A0GAGGT GAGAOGAOAG OAAGQAGAAA ACT6CATGAT 9420 

AACTTTACAG AATTGAATCT AGAGTCTTCC CCGAAAA6CC CAGAAACTTC TCTGCAGTAT 9480 

CTG6CTT6TC CATCTG6TCT AAG6TGGCTG CTTCTTCCCC AGCCATGAGT CStfSTTTGTGC 9540 

CCATQAATAA TAC3V0GACCT GTTATTTCCA TQACTGCTTT ACTGTATTTT TAAGGTCAAT 9600 
ATACTGTACA TTTGATAATA AAATAATATT CTCCCAAAAA AAAAA 

Seq ID NO: 417 Protein sequence
Protein Accession #: NP_056234.1

1 11 21 31 41 51

| | | | | |

MPKRAHH6AL SWLZLIiNGH PRVALACPHP CAOrVPSEVB CTFRSLASVP A6XARBVBRZ 60 

NLGFNSIQAL &ET8FAGLTK LBLLMIHGNB IPSIPDGALR DLSSLQVFKF SYNKLRVITG 120 

QTLQGLSNLM RLHIDHNKIB FIHFQAFNGL TSLRLLHLEG NLLHQUIP8T FSTFTFLDYF 180 

RLSTIRKIiYIi AENMVRTLPA SMItRHMFUiB NLYUIGHPHT CDCEMRHFLB WOAKSRGILK 240 
dOCDKAYEGG QLCAMCFSPK KLYKHEZHiCL iCDMTCLKPSZ E8PLRQNRSR SIEEEQEQEB 300

DGGSQLII.BK FQI.PQHSZ6L NHTDSKGNHV NLVCDIKKFN PVYKIBIWr OPFDXOIKAT 360 

VALDFECFMT RENYEKLWiCL lAYYSBVFVK LBREUOiSKD PSVSYQYRQD ADBEALYYTO 420 

VRAQILAEPE WVMQPSIDIQ LNRRQSTAKK VLLSYYTQYS QTISTKDTRQ ARGRSWVMIE 480 

FSGAVQROQT VLEGGPCQLS CNVKASESPS IFWVLPDQSI LKAPMDDPDS KFSILSSGWL 540 

RIKSHEPSDS QLYQCIAQVR DEMDRNVYRV LVQSPSTQPA EKDTVTIGKN POESVTIiPCN 600 

AIAIPEAHL8 WILPNRRIIN DIiAHTSBVYM LPNGTLSIPK VQVSDSGYyR CVAVllQQGAD 660 

HFTVGITVTK KGSGLPSKR6 RRPGAKALSR VREDIVE0E6 6SGM6DBENT SRfiLLHPKDQ 720 

EVFLKTKDDA INGDKKAKKO RSKLKLWKHS BKEPBTNVAE GRRVFESRSR XNMANKQINP 780 

ERWADILAKV RGKNLPKGTE VPPLIKTTSP PSLSIiEVTPP FPAVSPPSAS PVQTVTSABB 840 

SSADVPLLGE EBHVLGTISS ASMGLEKZniN 6VZLVEPEVT STPLEEWDD LSEXTBEITS 900 

TBCSDLKGTAA PTLISEPYEP SPTLHIUDTV YEKPIHSBTA TEGMSAADVG SSPSPTSSBY 960 

EPPLDAVSLA ESEFMQYFDP DLETKSQPDE DKMKS3TFAH LTPTPTIWVN DSSTSQLFED 1020 

STIGBPGVPO QSHLQGLTDN ZHLVKSSLST QDTX^IiZKKGM KEMSQTLQGG NMLEGDPTHS 1080 

R9SESEGQES KSZTLPDSTL GIMSSMSPVK KPAETTVGTL LDKDTTTVTT TPRQKVAPSS 1140 

TMSTHPSRRR PNGRRRLRPK KFRHRHKQTP PTTFAPSBTP STOPTQAPDZ KIS^QVESSL 1200 

VPTAMVI»rrV MTPXQLEMBK NAEPTSKGTP RRKHGKRPNK HRYTPSTVSS RAS6SKPSPS 1260 

PENKHRITZVT PSSETZLLPR TVSLKTEGPY DSLDYMTTTR KZYSSYPKVQ ETLPVTYKPT 1320 

SDQKEIKDDV ATNV0KHKSD ZLVTGESITN AZPTSRSLVS TKGEFKEESS PVGFPSTPTW 1380 

NPSRTAQPGR LQTDZPVTTS GENLTDPPLL KELEDVDFTS EFLSSLTVST PFHQEEAGSS 1440 

TTLSSZKVEV ASSQAETTTL DQDHLETTVA ZLLSETRPQN HTPTAARMKE PASSSPSTZL 1500 

MSLGQTTTTK PALPSPRZSQ ASRDSKENVF LMYVCaiPETE ATPVNNEGTQ HMSGPNELST 1560 

P8SDRDAFNL STKIiELEKQV FGSRSLPRGP DSQRQDGRVH ASHQLTRVPA KPZLPTATVR 1620 

LPEMSTQSAS RYPVTSQSPR HWTNKPEZTT YPSGALPENK QPTTPRIiSST TZPLPLHMSK 1680 

PSZPSKFTDR RTDQFKGYSK VFGMt»TZPEA RNPVGKPPSP RIPHYSI9GRL PFFTNKTL8F 1740 

PQLGVTRRPQ ZPTSPAPVMR ERKVIPGSYN RZHSHSTPHL DFGPPAPPLIi HTPQTTGSPS 1800 

TNLQNZPMVS STQSSISPIT SSVQSSGSPH QSSSKPPAGG PPASKPWSLQ EKPQZLTKSP 1860 

QTVSVTABTD TVFPCEATGK PKPFVTWTKV STGALMTPNT RIQRFEVLKN QTLVIRKVQV 1920 

QDR6QYMCTA SNLHGLDRMV VLLSVTVQQP QZLASHYQDV TVYL(a>TZAM ECLAKGTPAP 1980 

OZSHZPPDRR VHQTVSPVBS RITLHENRTL SIKEASPSDR GVYKCVASKA AGADSLAZRL 2040 

HVAALPPVZH QEKLENZSLP PGLSZHZHCT AKAAPIiPSVR WVL6DGTQZR PSQFLHGNLF 2100 

VPPNGTLYZR KIAPKDSGRY ECVAANLVGS ARRTVQUIVQ RAAANARZTG TSPRRTDVRY 2160 

GGTLKLDCSA SGDPWPRZLW RLPSKRMIDA LFSFDSRZKV FANGTLWKS VTDKDAGDYL 2220 

CVARNKVGDD YWLKVITWM KPAKZBHKBE NDEKVFYG6D IiKVDCVATGL PNPBZSWSLP 2280 

DGSLVHSFHQ 80DSGGRTKR YWFNNQTLY FNEVGMREBO DYTCFABHQV GKDEMRVRVK 2340 

WTAPATZRN KTYLAVQVPY GDWTVACBA R6EPMPKVTW LSPTNKVZPT SSEKYQZYQD 2400 

GTLIjXQKAQR SDSGNYTCLV SNSAGBDRKT VWIEVNVQPP KZNGNPNPZT TVREZAAGGS 2460 

RKLZDCKAEQ ZPTPRVLMAF PEGWLPAPY YGNRZTVHCai GSLDZRSLRK SDSVQLVCMA 2520 

RNEGGBARLI VQLTVLEPME KPZFUDPZSE KZTAMAGETZ SLKCSAAGTP TPSt*VNVLPN 2580 

OTDLQSGQQL QRFYBKAOQH LBI6GLSSVD AQAYRCVARN AAGBTERLVS XiRVaLKPSAN 2640 

KQYHNLVSZI HGETLRLPCT FPGA0Q6RFS WTZtPNGMHLB GPQTLGRVSL LDHGTLTVRE 2700 

ASVFDRGTYV CRMETBYGPS VTSZPVZVZA YFFRZTSEPT PVZYTRFGNT VKLNCMAMGI 2760 

PKADITNELP DKSHLKAGVQ ARIiYGNRFIA PQGSLTZQHA TQRSAGFYKC HAKNZLGSDS 2820 
KTTYZHVP 

Seq ID NO: 418 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..5001

1 11 21 31 41 51

| | | | | |

ATGCCAGGCA CAAAACTAAC CC3GAACAGGC GCCCCAGCAG ACTACAOAGT GATATTGAAG 60 

ACCTCTCAAG AGGACGAATT GGATGTACCT GACGACATCA GCGTCOSGGT TATGTCATCT 120 

CAGTGTGTGC TTGTGTCCTG GGTGGATCCT GTTCTGGAAA AACAGAAGAA AGTTGTTGCA 180 

TCAAGACAOt ACACGGTGOG CTATCX3AGA6 AAGGGGGAAT TG6CCA6GT6 GGATTATAAG 240 

GAGATCGCTA ACAGGOGTGT GCTGATTGA6 AACCTGATTC CA6ACACTGT GTATGAATTT 300 

GCAGTCCGTA TTTCACAGGG TGAAAGAGAT GGCAAATGGA GTACGTCAGT CTTCCAAAGA 360 

ACACCAGAAT CTGCCCCTAC CACAGCTCCT GAAAACTTGA ACGTCTGGCC AGTCAATGGC 420 

AAACCTACAG TTGTCGCTGC ATCTTGGGAT GCGCTACCAG AGACTGAGGG GAAAGTGAAA 480 

GTCTGTCTGC TGQACACAGG ACTGTTTTCA GTTTCCTCCT TCCAACCATC TGCCAAATCA 540 

TTTCAGAATA CATTCTTTCA TACGCCCOSG CTCTCAAACC ATTTGGAGCA AAGTCCCTCA 600 

CCTATCCTGG AGACACTACT TCTGCCCTGG TGGATGGTCT GCAGCCT6GG GAACGCTATC 660 

TTTTCAAAAT OCGGGCCACA AACAGGAGAG GCCTGGGACC TCACTCCAAA GOCTTCATTG 720

TCQCTATGCC AACAAGAATG CAGCTGTACC C3WSAAGGATT TCA6TTCTCT AGCTTACCTG 780 

ATOGATATCC AAACCAAACA AGTTAATAAA GATCCACAAC TGGAAGGGAG T6TTTTTGGA 840 

CCATGTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA TTGGOGGCTT TTCCTTCATT 900 

ATQT6CTATG AA6ACCCANN TGTTTCTTCT TTQACAGGCA ATTCTTTAAA ATCT6TTGCA 960 

GCCAGTAAGG CGOATGrTCA 6CAGAACA0G GAGGACAATG GGAAACCOGA AAAACCTGAG 1020 

CCTTCCTCAC CTTCTCCCAG AGCTCCAGCT TCCTCECAAC ACCCCTCTGT GCCTGCTTCT 1080 

CCCCAAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTQA AGAACAAAAT ATTGGCTAAT 1140 

GGTGGGGCGC CCCGAAAACC CCAGCTTCGC GCCAAGAAGG CAGAGGAGCT GGATCTTCAG 1200 

TC6ACAGAAA TCACTGGGGA GGA6GAGCTG GGTTCCGQGG AGGACTCGCC CATGTCACCC 1260 

TCAGACACCC AASACCA6AA AOGGACCCTG AGGCCGCCAA GTAGACAOGG CCACTOGSTO 1320 

GTTGCTCCC6 GCAGGACTGC AGTGAGGGCC C6GATGCCAG CGCTGCCC06 AA6GGAAG6C 1380 

GTAGATAAGC CTGGCTTTTC CCTGGCCACX5 CAGCCCCGCC C3W3GGGCGCC CCCCTCGGCT 1440 

TCGGCCTCTC CTGCCCACCA CQOSTCCACC CAGGGCACCT CTCATCGTCC TTCCCTGCCT 1500 

GCCI^GCrrGA ATGACAAC6A CTTGGTGGAC TCAGAOGAAG ATGAGCGC6C T6TGG6CTCC 1560 

CTCX3kCCCCA AGQGCiaCCTT GGCCCAGCCC OQGCCAGCCC TGTCCCCGAS CC6CCAGTCC 1620 

CCGTCCAGCG TTC7COGCX3A CAGAAGCTCT QT6CACCCCG 60GCAAAGCC AGCCTCGCXX3 1680 

GCGCGGAGGA CCCCCCATTC AGGGGCOGCA QAGQAAGATT CCAGTGCCTC AGCCCCACCC 1740 

TCAAGACTTT CTCCACCCCA TGGGGGATCA TCTCGGCTGC TGCCCACCCA GCCACACCTG 1800 

AGCTCTCCAC TTTCCAAGGG CGGGAAGGAT GGTGAGGACG CTCCAGCCAC CAACTCX:aAT 1860 

GC6CCATCAC GGTCCACCAT GTCCTOCTCC GTCTCTTCTC ATCTCTCX3TC CA6GAC3GCAG 1920 

GTCTCTGAGG GAGCGGAGGC TTCTGATGGT QAAAGCCACG GTGACGGCX3A TAGG6AAGAC 1980 

GGCGGAAGGC AGGCGGAGGC CACGGCCX»G ACGCTGCGGG CmXSCCTGC CTCTGGACAC 2040 

TTCCATTTGC TCAGACACAA ACCCTTTGCT GCCAACGGGA GGTCTCCAAG CAGGTTCAGC 2100 

ATTGGGCGGG GACCTCGGCT GCAGCCCTCC AGCTCCCCAC AGTCGACTGT GCCCTCCCGA 2160 
GCCCACCCCA GGGTTCCCTC TCACTCTCSAT TCCCACCCTA AGCTTAGCTC AGGTATCCAT 2220

GGAOACGAGQ AGGATSAGAA GCGGCTTCCT GGCAC06TTG TCAATQACCA CGTGCCTTCC 2280 

TCCTCCftGGC AGCCCATCTC CC0G(3GCTGG GACGACTTAA GGAGAAGCCC 6CAQAGAGGG 2340 

GCCAGCCTGC ATCGGAAGGA ACCCATCCCA GAGAACXXXA AATCCACAGG GGCAGATACA 2400 

CATCCTCAQG GCAAGTACTC CTCCCTGGCC TCCAAGGCTC AGGATQTTCA ACAQAGCACA 2460 

OACGCGGACA CGGAGGGTCA TTCTCCCAAA GCACAGCCAG GGTCCACAGA CCX3CCACGCG 2520 

tCCOCSQCCC GTCCTCeXBC AGCAOGGTCA CA6CA6CATC CCAGTCTTCC CAGAAGOATG 2580 

ACaCCOGQCC GGGCCCC3«3A ACAGCAGCCC CXTTCCTCCXXS TOQCCAOGTC CCAflCACCAC 2640 

CCGGGACXXX: AGAGCAGAGA OGCGGGTCGG TCACCTTCCC AGCCCftGOCT CTCACTGACC 2700 

CAGGCCGGGC GGCCCCGCCC CAOGTCGCAG GGCCGCTCCC ACTCCTCOTC GGACCCTTAC 2760 

ACGGCGAGCT CCAGAGGGAT GCTCCCCACG GCCCTCCAGA ACCAQQACQA GGATQCCCAS 2820 

GGCAGCXAOG A0GAC6ACA6 CACAQAASTC GAQGCCCAGG ATGT6G666C CCCCX3CGCAC 2880 

GCOGOGOGOG CCAAGGAGGC AGCTGCGTCC CTTCCCAAGC ACCAGCAGGT GGACTCTCXX 2940 

ACAG6CGCAG GGGCAGGTGG OGACCACAGG TCCCAGOQCG GACATGOGGC CTCCCCCGCC 3000 

AGGCCCAGCC GACCOGGOGG CCCCCAGTCC OGCGCCCGGG TCCCCAGCAG GGCAGOGCOS 3060 

GGGAAGTCGG AGCCTCCTTC CAAGCGGCCC CTGTCCTCCA AGTCCCAGCA 6T0GGTCTCA 3120 

6GGGAGGAC6 AGQAGGAGOA GGAOSOGGGG TTTTTTAAAG GCGGOAAAGA AGACXnTCTG 3180 

TCTTCCTCTG TGOCAAAGTG GCCCTCTTCC TCCACTCCCA GGGGOGGCAA AQACGCCGAT 3240 

QGGAGCCTOG ciCAAGQAAGA GAGGGAGCCT GCCATCGOGC TTGCCCCTCX3 COGAOGGAGC 3300 

CTGGCTCCTG TGAAGCGACC TCTCCCCCCA CCTCCAGGCA GCTCCCCCAG GOCCTCCCSiC 3360 

GTCCCTTCCC aACCGCOaCC TOSCAGOGCT GCCACCGTGA GCCCC3GTCGC 6GGCACCCAC 3420 

CCCTGGCCGC GGTACACCAC GOGCGCCCCV CCTGGCCACT TCTCCACCAC CCCGATGCTG 3480 

TCCTTGQGCC AGAGGATGAT GCATGCCAGA TTCCGTAACC CTCTCTCCCX3 ACAGCCTGCC 3540 

A6ACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAQOGAA AGTCCTTCXT 3600 

GGTAGTAATG GAAAACCGAA TGGACAOAGA ATTATCAATG GGCCTCAAG6 AACAAAGTGG 3660 

GTTGTGGACC TTGATCGTGG GTTAGTATTG AATGCAGAAO GAAGGTACCT CCAAGATTCA 3720 

CATGGAAATC CTCTTCGGAT TAAACTAGGA QGAGATQGTC GAACCATTGT AGATCTGGAA 3780 

GGGACXCCCX3 TGGTGAGTCC TGACOGCCTC CCACTCTTTG GGCAGGGGCG ACATGGCACA 3840 

CCTCTQGCCA ATGCC3CAAGA TAAGCCAATT TTGAGTCTTG GAGGAAAGCC 6CTGGTGGGC 3900 

TTGGfiGGTCa TCAAAAAAAC CACOCATCCC CCTAOCACTA CCATGCAOCC CACCACTACT 3960 

ACGACGCXXX: TGOCTACCAC TACAACCCCG AGGCCCACCA CTGCCACCAC CATGCAGCCC 4020 

ACCACTACTA CXSAOSCCCCT GCCTACCACT ACACCX3AGGC CCACCACTQC CACCACCCGC 4080 

QGCAOGACCA CCAGGCGTCC AACAACCACA GTCCGAACCA CTACGCGQAC AACCACCACC 4140 

ACCACCCCCA AACCCACCAC TCCCATCCCC AOCTGTCCCC CTGGGACCTT GGAAOGGCAC 4200 

GA06AT6AT0 GCAACCTGAT AATGAGCTCC AATGOSATCC GAGAOraCTA CGCTSAAGAA 4260

QATGAiGTTCT CAOGCTTGGA GACT6ACACT GCASTACCTA OGQAAOAGGC CTAOGTTATA 4320 

TATGATGAAG ATTATGAATT TGAGACX3TCA AGGCCACCAA CCACCACTGA GCCTTCGACC 4380 

ACTGCTACCA CACXXSAGG6T GATCXICAGAG GAAGGOGCCA TCAGTTCCTT TCCTGAAGAA 4440

GAATTTGATC TGGCTGGAAO GAAAOQATTT GTTGCTCCTT AOGTGAOGTA CCTAAATAAA 4500 

QACCCATCAO CCOOSTOCTC TCTGACT6AT 6CACT66ATC ACTTOCAAaT GGACftGCCTG 4560

6AT6AAATCA TCCCCAATGA CCTGAAGAAG AOTGATCTGC CTCOCO^GCA T6CTOCCC98C 4620 

AACATCACCG TGQTGGCCGT GGAAGGTTGC CACTCATTTG TCATTGTGGA TTGGGACAAA 4680 

GCCACCCCAQ GAGATTTGGT CACAGGTTAT TTGGTTTACA GTGCATCCTA TOAAGATTTC 4740 

ATCAGGAACA AGTTTTCCAC TCAAGCTTCA TCAGTAACTC ACTTGCCCAT TOAGAACCTA 4800 

AAGGCCAACA 06AGGXATTA TTTTAAA0IG CAA6CACAAA ATCX^TCATGO CTA08GACCT 4860 

ATCA0C0C7T OQGTCTCATT TGTCACCGAA TCASATAATC CTCTOCTTOT TOTQAGGCCC 4920 

CCAGGCGGTG AGCTATCTGG ATCCCATTOS CTTTCAAACA TGATCCCAQC TACAOGGACT 4980

GCCATGGAOG GCAATATGTG AAGOGCACOT GGTATOGAAA GTTCGTGGGA GTTGTTCTTT 5040 

GTAATTCaCT GAGGTATAAA ATCTACCTCA GTGACAACCT GAAAGATACA TTCTACaGCA 5100 

TTGGAGACAG CTGGQGAAGA GGTGAAGACC ATTGCCAATT TQTGGATTCA CACCTTGATG 5160 

GAAGAACAGG GCCTCAGTCC TATOTAQAAG CCCTCOCTAC TATTCAAGGC TACTATC6CC 5220 

AGTATCGTCA GGAGCCTGTC AGGTTTGGGA ACATGGGCTT CGGAACCCCC TACTACTATG 5280 

TGGGCTGGTA CGAGTGTGGG GTCTCCATCC CTGGAAA6TG GTAATCACAG GACCGTCATG 5340 

CTGCAAQCTT GCCCTGCCCA GCCCCACCAA CTAAGTOGCA CTAGGGGCTG TGAGCAAAGA 5400 

CA6GCAGCAT GCTCAGCCCC GCXGCCCTAO GTGCGAGGAA GGTCACAGAT GGACACT6GC 5460 

CATTCTQGTC ATCTCAGTCT aGAACTC3«3T CCCACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCAGTTTTG CTGTTAACTT TGCTTCTCTA CTTTTTTTTG TTTGTTTGTA ATA6CACATC 5580 

CCAQAGACAT CAGAAACCAG CAACTGATTC AGTGTGATTT CCCAGACTTT TTAOOCATGA 5640 

AATTCGQACA CTTCAGTATT TCCAGGAATA GCATATGCAC GCTQTTCTTG CTTCATGGAA 5700 

'TOCTACATGC TiT C m rm ' TCTCATTTTG GATTTCTCCA AAACTAACTO AATTTAAOCT 5760 

TGAG6TCCCT TTGTAT6CAG TAGAAAGQAA TTATTAAAAA CACCACCAAA GAAAATAAAT 5820 

ATATCCTACT TGAAATTTAC TCTATGGACT TACCCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTGTAA ATTCTCAATT TTQATATATA TATGTATATA TQCATATACa TATCCACACT 5940 

TGTCTGCAAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCACCAA AAAAAAAAAA 6000 
AAAAAAA 

Seq ID NO: 419 Protein sequence
Protein Accession #: Eos sequence

1          11         21         31         41         51

|          |          |          |          |          |

                                            QSVLVSWVDF VLEKQKKWA 60

                                            AVRISQGERD GKHSTSVFQR 120

                                            VCLL0T6LFS VSSFQPSAKS 180

                                            FSKS6FQTGE AMDLTPKPSL 240

                                            PCPLPYFLTF MLDIGGPSFI 300

                                            8SPSPRAPAS SQEPSVPASP 360

                                            TBZTQBEBLQ 8SBD8PK8PS 420

                                            DXPGPSIiATQ PRPCSVPPSAS 480

                                            HPKGAFAQPR PALSPSRQSP 540

                                            RLSPPHGGSS RLLPTQPHLS 600

                                            SEGABASD6E SHGDGDREDG 660

                                            6RGPRLQPSS SPQSTVFSRA 720

                                            SRQPZSRGWE DLRRSPQRGA 780

                                            ADTEGHSPKA QPGSTDRHA8 840

                                            GPQSRDAGRS PSQPRLSLTQ 900

                                            SYDDOSTEVE AOOVRAPAHA 960
ASAKEAAASL PKHQQVeSPT QAGAGGDBRS QRCSIAASPAR PSRPGGPQSR ARVPSRAAP6 1020

KSEPP8KRFL SSKSQQSVSA EDEBEEDAGF FIQGGKEDIiLS SSVPKNPSSS TPRGGKDADG 1080 

8LAKEBRBPA ZAIAPRGGSL APVKRPLPPP PGSSFRASRV PSRPPPRSAA TVSPVAGTHP 1140 

WPRYTTRAPP GHPSTTPMLS LRQRMMHARF RMPLSRQPAR PSYRQGYNGR PNVEGKVLPG 1200 

SNGKPNGQRI INGPQGTKWV VDLDRGLVLN AEGRYLQDSH GNPLRIKLGG DGRTIVDLEG 1260 

TPWSPDGLP LFGQ6RHGTP LANAQDKPIL SLGGKPLVOL EVIKKTTHPP TTTMQPTTTT 1320 

TPLPTTTTPR PTTATTMQPT TTTTPLPTTT PRPTTATTRR TTTRRPTTTV RTTTRTTTTT 1380 

TPKPTTPIPT CPPGTLERHP DDGNLIMSSH GIPECYJiEED EFSGLETDTA VPTBEAYVIY 1440 

DEDYEPETSR PPTTTEPSTT ATTPRVIPEB GAISSPPEEE FDIAGRKRFV APYVTYLNKD 1500 

PSAPCSLTDA UJHFQVDSLD EIIPNDLKKS DLPPQHAPRN ITWAVBGCH SFVIVDWDKA 1560 

TPGDLVTGYL VYSASYBDPI RNKPSTQASS VTHIiPIENLK PNTRYYFKVQ AQNPHGYGPI 1620 
SPSV8FVTBS ONPLLWRPP G0EL86SHSL 8NMIPATRTA ND6HM 



Seq ID NO: 420 DNA sequence

Nucleic Acid Accession #: NM_022743

Coding sequence: 128..1237

1 11 21 31 41 51

| | | | | |

GTGGATTTTA GAGATACXrrC CCCTCCTTCT GCTCAGCTGC CTTGCAGTAA TTAAACTCTT 60

TCTCTGCTGC AACACCCCTA CTGTTCTCCX5 TGTATTGGCT TTTCTGGGCA GCAGGAAGGA 120 

AAAGCTGATG OGATGCTCTC AGTOCGGCXrr CX3CCAAATAC T6TAGT6CTA AGTGTCAGAA 180 

AAAA6CTTGG CXAGACO^CA AGOSGGAATO CAAAT6CCTT AAAAGCTGCA AACCCAGATA 240 

TCCTCCAGAC TCCGTTOGAC TTCTTGGCAG AGTTGTCTTC AAACTTATGG ATGGAGCACC 300 

TTCAGAATCA GAGAAGCTTT ACTCATTTTA TGATCTGGAG TCAAATATTA ACAAACTGAC 360

TGAAGATAAG AAAGAGGGCC TCAGOCAACT C6TAATGACA TTTCAACATT TCATGAGAGA 420 

AGAAATACA6 GAT6CCTCTC A6CTGCCACC TGCCTTTGAC CTTTTTGAAG CCTTTGCAAA 480 

AGTGATCTGC MCTCTTTCA CCATCTGTAA TGCGGAGATG CAGGAAGTTG GTGTTQGCCT 540 

ATATCCCAGT ATCTCTTTGC TCAATCACAG CTGTGACCCC AACTGTTCX3A TTGTGTTCAA 600 

TCGGCCCCAC CTCTTACTGC GAQCAGTCCG AGACATCGAG GTGGGAGAGG AGCTCAOCAT 660 

CTGCTACCTG GATATGCTGA TGACCAGTGA GGAGCGCCX3G AAGCAGCTGA GGGACCAGTA 720 

CTGCTTTQAA TGTGACTGTT TCCGTTGCCA AACCCAGGAC AAGGATGCTG ATATGCTAAC 780 

TGGTGATGAG CAAGTATGGA AGGAAGTTCA AGAATCCCTG AAAAAAATTG AAGAACTGAA 840 

GGCACACTGG AA6TGGGAGC AGGTTCTGGC CATGTGGCA6 GGGATCATAA GCAGCAATTC 900 

T6AACX3GCTT CCCX^TATCA ACATCTACCA GCTGAAGGtG CTOQACTGCG CCATC^TGC 960 

CTGCATCAAC CTCGGCCTGT TGGA6GAAGC CTTGTTCTAT GGTACTOGGA CCATG6AGCC 1020 

ATACAGGATT TTTTTCCCAG GAAGCCATCC CGTCAQAGGG GTTCAAGTGA TGAAAGTTGG 1080 

CAAACTGCAG CTACATCAAQ GCATGTTTCC CCAAGCAATG AAGAATCTGA GACTGGCTTT 1140 

T6ATATTATG AGAGTGACAC ATGGCAGAGA ACACAGCCTG ATTGAAGATT TGATTCTACT 1200 

TTTAGAAGAA TGGQAGGCX31 ACATCAGA6C ATCCTAAGGG AACX3CASTCA GAGQGAAATA 1260 

OGGCGTGTGT CTTTGTTGAA TGCCTTATTG AGGTCACACA CTCTATGCTT TOTTAGCrGT 1320 

GTGAACCTCT CTTATTGGAA ATTCTGTTCC GTGTTTGTGT AGGTAAATAA AGGCAlGACAT 1380 

GGTTTGCAAA CCACAAGAAT CATTAGTTGT AGAGAAGCAC OATTATAATA AATTCAAAAC 1440 
ATTTGGTTGA GGATGCX:AAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 421 Protein sequence
Protein Accession #: NP_073580

1 11 21 31 41 51

| | | | | |

MRCSQCRVAK YCSAKC9QKKA WPDHKRECKC IiKSCKPRYPP DSVRIJLiGRVV FKUffiGAPSE 60 

SEKI»YSFYDL ESNINKLTED KKEGLRQLVM TPQHFMREEI QDASQLPPAP DLPEAPAKVI 120 

CNSFTICNAE MQEVGVGLYP SISLLNHSCD PNCSIVPNGP HLLLRAVRDI BVGBELTICY 180 

LDMIiMTSEER RKQIiRDQYCF ECDCFROQTQ DKDADMLTGD EOVWKBVQES LKKIEBLKAH 240 

tnCNEQVIiANC QAIIS8NSBR LFDZNZYQLK VLDCANDACI NLQLLESALF YGTRTMBPYR 300 

IPFPGSH9VR GVQVHRVaKL QLHQGMFPQA MKHLRLAFDI MRVTHGREBS LIEDLZLUiE 360 
ECDANIRAS 



Seq ID NO: 422 DNA sequence

Nucleic Acid Accession #: NM_003014.2

Coding sequence: 238..648



1 11 21 31 41 51

| | | | | |

GGCGGGTTCG CGCCCOGAAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60

CGGAGCTCCG CGGCXX5GACC CCGOGGCCCC GCTTTGCTGC CGACTGGAGT TTGQGGGAAG 120

AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGOGAAO GQACAGOGAA AGATGAGGGT 180

GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGOTGGCAa OQCGAOAGGG CAOTGCCATG 240

TTCCTCTCCA TCCTAGTGGC GCTGTGCCT6 T66CT6CACC T66G6CT6G6 GGT6GGC6GC 300

GCQCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360

ATGCCCAACC ACXTTGCACCA CAGCACGCAG GAC5AA0GCCA TCCTGGCCRT CGAGCAGTAC 420

GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480

GCGCCCATTT GCACCCTGOA QTTCCTGCAC GAGCCTATCA AGCCGTGCAA GT0G0TQT6C 540

CAA0G06CGC GG6A0GACT6 GQA6CCCCTC ATGAAGATGT ACAACCACAG CTG6CCC6AA 600

AGCCTGGCXrr GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660

ATCGTCACGG ACCTCCCGOA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720

CAGGAAAGGC CTCTTGATGT T6ACTGTAAA C6CCTAA6CC CCGATGGGTG CAAGT6TAAA 780

AACKTTGAAOC CAACTTtCOC AAC3GTATCIC A6CAAAAACT ACAGGTATOT TATTCATGOC 840

AAAATAAAAG CTGT6CAGAG 6AGT66CTGC AAT6AG6TCA CAACGGTGGT G^TGTAAAA 900

GAGATCTTCA AGTCCTCATC ACCCATCCCT CX3AACTCAAG TCXXX3CTCAT TACAAATTCT 960

TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020

C6TTCAAGGA TGATGCTTCT T6AAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTA6T 1080

AAAAGATCCA TACAGTGGGA AGA6A0GCT6 CAGGAACAGC GQAOAACAGT TCAGGACAAQ 1140

AAGAAAACAG CCGGGCGCAC C»GTC!GTAGT AATCXrCCCA AACCAAAG6G AAAGCCTCCT 1200

GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260

AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCXX3AC TTCCTTACAG 1320

GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC AT6T6GCCCT TGCCCTAACA 1380
ACTCACTGCA GT6CTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGCTTCA 1440

U ' lTT ' l ' l ' LTl'r GTAAGCCATC ACAAGCCATA GTGGTAGQTT TGCCCTTTGG TACAGAAQGT 1500 

6AGTTAAA6C TOGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCT6 TGTGCATACT 1560 

CrA(5AA0AGT AGGGAAAATA ATGCTTGTTA CAATTOtSACC TAATATGTGC ATTGTAAAAT 1620 

AAATGCCATA TTTCAAACAA AACAC6TAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 

TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAAT6TGAT GAAAATATAA TGTTTTTAAG 1740 

AAGOAACAOT AGIOGAATQA ATGTTAAAAG ATCTTTATGT GTTTATGOTC TGCAGAAGGA 1800 

TTTTTCTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAQAAG TAGCATATGG AAAATTATAA 1860 

TGTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAQCT AGAAACTTAA AAACAAAAAT 1920 

AATAATAAAG AAAAATAAAT AAAAAGGAGA GQCA6ACAAT GTCT6GATTC CTGXTT7TTG 1980 

GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 

ACAOTQAGTT TGTCTQTACC ATTAG6AGTT AGGTACTAAT TAGTTG6CTA ATGCTCAAGT 2100 

ATTTTATACC CACAAOAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 

AATAATTTQA CAAGCTTAAA AATGGCCTTC AT6TGAGTGC CAAATTTTOT TTTTCTTCAT 2220 

TTAAATATTT TCTTTGCCTA AATACATGTG AOAGGAGTTA AAXATAAATQ TAGAOAGAGO 2280 

AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTOACAG TTGGQATACT TTAATCAGAA 2340 

AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 

ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 

AGGCATTCAA TAAATGCACA ACGCCCAAA6 GAAATAAAAT CCTATCTAAT CCTAC TCTCC 2520 

ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAOGT GTTTGCTTAT 2580 

GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 

CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 

TCTCATTTCT AACA6CTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACOGGAATTC 

Seq ID NO: 423 Protein sequence
Protein Accession #: NP_003005.1

1 11 21 31 41 51 

I I I I I I 

MFLSILVALC LWIiHLALGVR GAPCEAVRIP MCRHMPNHIT RMFNHLHHST QSIAILAIEQ 60 

YBELVDVNCS AVUIPPFCAM YAPICTLBPL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120 

ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180 

KKVKPTLATY LSKNYSYVIH AKIKAVQRS6 CaJBVTTWDV KEIPKSSSPI PRTQVPLITN 240 

SSOQCPHXIiP HODVLZMCYB HRSRMMZiLEV CLVBKHRDQL SXR8ZQHEER LOEQRRTVOD 300 
KKKTAGRT8R SNPPRPKOKP PAPKPASPXK NZKTRSAQKR TNPKRV 

Seq ID NO: 424 DNA sequence
Nucleic Acid Accession #: BC010423
Coding sequence: 248..1780

1 11 21 31 41 51 

I I I I I I 

CACAGCGTGG 6AAGCAGCTC TGGGGGAGCT OGGAGCTCCC 6ATCA0GGCT TCTTGGGGGT 60 

AGCTAG6GCT GGOrGTOTAO AACGGGQC08 GGQCTGGGGC TGGGTCCCXrr AGTGGAGACC 120 

CAA0TGC3GAQ AGGCAAGAAC TCTOCAOCTT CCTOCCTTCT GGGTCAGTTC CTTATTCAAG 180 

TCTGCAGCCG GCTCCCAGGG AGATCTOGGT GGAACTTCAG AAAC6CTGGG CAGTCTGCCT 240 

TTCAACCATG CCCCTGTCCC TGGGAGCOGA GATGTGGGGG CCTGAGGCCT GGCTGCTGCT 300 

GCTGCTACTG CT66CATCAT TTACAGGCOG GTGCCCOGCG GGTGAGCTGG AGACCTCAGA 360 

CGTGGTAACT 6TGGTGCTGG GCCAGQACX3C AAAACTGCCC TGCTTCTACC GAGGGGACTC 420 

GGGOQAGCAA GTGGG6CAAG T06CATG6GC TCGGGTGGAC GGGGGCGAAO G06CCCAGGA 480 

ACTAGCGCTA CTGCACTCCA AATAC06GCT TCATGTGAGC CC3GGCrTACG AGGGCaSOGT 540 

GGAGCAGCCG COGCCX:CCAC GCAACCCCCT GGACGGCTCA QTGCTCCTGC GCAAOSCAGT 600 

GCAGGCGGAT GAGGGOSAGT ACXSAGTGCOG GGTCAGCACC TTCCCCGCOG GCAGCTTCCA 660 

GGGGC66CIG OGGCTCaSAO TGCTGGTGCX: TCCCCTOCCC TCACTGAATC CTGGTCCAGC 720 

ACTA8AAGAG OGCCBGGGCC TGACCCTGGC AGCCTCCTGC ACAGCTGAGG GCAGCCCAGC 780 

CCCX3W5CGTG ACCTGQGACA CGGAGGTCAA AGGCACAACG TCCAGCCGTT CCTTCAAGCA 840 

CTCCCGCTCT GCTGCCGTCA CXTrCAGAGTT CCACTTGGTG CCTAQCCX3CA 6CATQAATGG 900 

GCAGCCACTG ACTTGTQTGQ TOTCCX»TCC TOiSCCTGCTC CAGGACCAAA GGATCACCCA 960 

CATCCTCCAC GTGTCCTTCC TTGCTGAGGC CTCTGTGAGG GGCCTTGAAG ACCAAAATCT 1020 

GTGQCACATT GGCAGAGAAG GAGCTATGCT CAAGTGCCTG AGTGAAGGGC AGCCCCCTCC 1080 

CTCATACAAC TGGACAOGGC TGGATGGGCC TCTGCXKAGT GGGGTACGAG TGQATGGGGA 1140 

CACTTTGGGC TTTCCCCCAC TGACCACTGA GCACAGCGGC ATCTAOGTCT GCCATOTCAG 1200 

CAATQAGTTC TCCTCAAGGG ATTCrCAGQT CACTGTQGAT GTTCTTGACC CCC3U3GAAGA 1260 

CTCTGGGAAG CAGGTGGACC TAGTGTCAQC CTOGGTQGTG GTGGTGGGTG TGATCGCCGC 1320 

ACTCTTGTTC TGCCTTCT6G TGGTGGTGQT GGTGCTCATG TCCCGATACC ATOGGOGCAA 1380 

GGCCCAQCAG ATGACCCAGA AATATOAGQA GGAGCTOACC CTGACCA666 AGAAC TCCAT 1440 

CCX3GAGGCTG CATTCCCATC ACACG6ACCC CAG6A6CCAG COOGAOGAGA GTGTAGG6CT 1500 

GAGAGCOQAG GGCCACCCTG ATAGTCTCAA GGACAACAGT A6CTGCTCT6 TGATGAGTGA 1560 

AGAGCGOGAG GGCOaCAGTT ACTCCAOGCT GACCAOGGTG AGG6AGATAG AAACACAGAC 1620

TGAACTGCTG TCTCCAGGCT CTGOGCGGQC OGAGGAGGAG GAAGATCAGG ATGAAGGCAT 1680 

CAAACAGGCC ATGAACCATT TTGTTCAGGA GAATGGGACC CTACGGGCCA A6CCCACGGG 1740 

CAATG6CATC TACATCAATO G6088G6ACA CXTGGTCT6A 0CCAQ6CCTG CCTCCCTTCC 1800 

CTAOGCCTCO CTCCtTCTGT TGACATGGQA GArTTTAGCT CATCTTGGG6 GCCT CCTTAA 1860 

ACACCCCCAT TTCTTGOGGA AGATGCTCCC CATCCCACTG ACTGCTTGAC CTTTACCTCC 1920 

AACCCTTCTQ TTCATCQGGA GGGCTCCACC AATTGAGTCT CTCCCACCAT 6CATGCAGGT 1980 

CACTGTGTGT GTGCATGTGT GCCTGTGT6A GTGTTGACTG ACTGTOTGTG TGTGGAGGGG 2040 

TQACTGTCGG TG6AGGGGTG ACTGT6TG0G TGGTGTQTAT TATGCTGTCA TA TCAG AGTC 2100 

AAGIGAACTG TGGTGTATGT GCXaCXSGGAT TTGAGTOGTT GOGTGQGCAA CACT gTCAG Q 2160 

GTTT6G06TG T6TGTCATGT GGCTGTGT6T OACX^TCTGCC TGAAAAAGCA GGTATTTTCT 2220 

CAGACCCCAG AGCAGTATTA ATGATGCACSA GGTTGGAGGA GAGAGGTGGA GACTGTGGCT 2280 

CAGACCCAGG TGTGOGGGCA TAGCTGQAGC TGGAATCTGC CTCCGGTGTO AGGGAACCTQ 2340 

TCTCCTACCA CTTCGGAGCC ATGGOGGCAA GTGT6AAGGA GOCRGTCCCT GGGTCAGCX31 2400 

GAGGCTTGAA CTGTTACAOA AGCC C TCTGC CCKTGGTGG CCTCTGGGOC TGCTOCATGT 2460 

ACATATTTTC TGTAAATATA CATGCGCOQG GAGCTTCTTG CAGGAATACT GCTCCGAATC 2520 

ACTTTTAATT TTTTTCTTTT TTTTTTCTTG CCCTTTCCAT TAGTTGTATT TTTTATTTAT 2580 

TTTTATTTTT ATTTTTTTTT AGA6TTTGAG TCCAQCCTGG AOGATATAGC CA6ACCCTGT 2640 

CT6TAAAAAA ACCAAAACCC AAAAAAAAAA AAAAAAAAAA

Seq ID NO: 425 Protein sequence
Protein Accession #: AAH10423

1 11 21 31 41 51

MPLSLGAEMH GPEAHUiLLL LLASFTGRCP AGBLBTSDW TWLGQDAKIi PCFYRGDSGE 60 

QVGQVAWARV DAGEGAQELA UflSKYGLHV SPAYEGRVEQ PPPPRNPLD6 SVUJiNAVQA 120 

DEGBYECRVS TPPAGSFQtAR LRLRVLVPPL PSLNPGPALE EGQGLTIAAS CTAEGSPAPS 180

VTWDTEVRST TSSRSPKHSR SAAVTSEPHL VPSRSMNGQP LTCWSHPGL LQDQRITHIL 240 

HVSPLAEASV ROLBDQNLWH IGRBGAMLKC IiSEGQPPPSY NWTRIiDGPLP SGVRVDGDTL 300 

GPPPLTTEHS GiyVCHVSNK FSSFDSQVTV DVLDPQEDSG KQVDLVSASV WVGVIAALli 360 

FCLLWWVL MSRYHRRKAQ QMTQKYEKEL TLTREKSIRR LHSiOiTDPRS QPEESVGLRA 420 

EGHPDSLKON SSC8VMSEEP EGRSYSTLTT VREIETQTSL LSPGSGRAEB EEDQDBGZKQ 480
AMNHPVQBNG TLRAKPTGNG lYINGRGHLV 

Seq ID NO: 426 DNA sequence
Nucleic Acid Accession #: NM_003474.2
Coding sequence: 307..3036

1 11 21 31 41 51

I I I I I I

CACTAAC3GCT CTTCCTAGTC CCX:GGGCCAA CTGGGACAGT TTGCTCATTT ATTGCAACGG 60 

TCAAGGCTGG CTTGTGCCAG AACGQOGOGC GCX30GACQCA CGCACACACA OGOGGqaAAA 120

CTTTTTTAAA AATGAAAGGC TAGAAGA6CT CAGC660GGC GCGGGCCX3T6 CGCGAGGGCT 180 

CGGGAGCTGA CTOGCCGAGG CAGGAAATCC CTCCGGTCGC GA0GCCXX3GC CCCGCTCGGC 240 

GCCCGOGTGG GATGGTGCAG CXSCTOGCOGC CX3GGC0CQAG AGCTGCTGCA CTGAAQGCCG 300 

GCGAOGATGG CAQ0GC6CCC GCT6CCCGTG TCCCCOGCOC G CGCC CTCCT GCTC3GCCCTG 360 

GGGG6T6CTC TGCTGQOGCC CTGGQAGGCC CQAGQGGTGA GCTTATGGAA CGAAGGAAOA 420

GCTGATGAAG TT6TCAQTOC CTCTGTTC6Q AGTGGGGACC TCTGOATCCC AQTGAAGAGC 480 

TTCQACTCCA AGAATCATCC AGAAGTGCTG AATATTCQAC TACAAOGGGA AAGCAAAGAA 540 

CTGATCATAA ATCTGGAAAG AAATGAAQGT CTCATTGCCA GCAGTTTCAC GGAAACCCAC 600 

TATCTGCAAG ACGGTACTGA TGTCTCCCTC GCT06AAATT ACACXSGTAAT TCTGGQTCAC 660 

TGTTACIACC ATGGRCATaT AOGGGGATAT TCTGATTCAG CAGTCaGTCT CROCAOGTQT 720

TCTGOTCTCA GOGQACTTAT TGTOTTTGAA AATOAAAGCT ATGTCTTAOA ACCAATOAAA 780 

AGTGCAACXA ACAGATACAA ACTCTTCCCA GCGAAGAAGC TGAAAAGCQT CXX3QGGATCA 840 

TGTOGATCAC ATCACAACAC ACCAAACCTC GCTGCAAAGA ATGTGTTTCC ACCACCCTCT 900 

CAGACATGGG C»AGAAGGCA TAAAAGAGAG ACCXTTCAAGG C3ACTAAGTA TGTGGAGCT6 960 

GTGATOGTGG CAOACAACOG AGAGTTTCftS AGQCftAGGAA AAOATCIOGA AAAAGTTAAG 1020

CAGCGATTAA TAGAGATTGC TAATCAOGTT GACAAGTTTT ACAGACCACT GAACATTOQO 1080 

ATCGTGTTGG TAGGOGTGGA AGTGTGGAAT GACATGGACA AATGCTCTGT AAGTCAGGAC 1140 

CCATTCACXA GCCTCCATGA ATTTCTGGAC TGGAGGAAGA TGAAGCTTCT ACCTCX5CAAA 1200 

TCCCATGACA ATGCXSCAGCT TGTCAGTQGG 6TTTATTTCC AAGGGACCAC CATCGGCATG 1260 

OCCCCAATCA TQRGCATGTG CACGOCS^GAC CAGTCTGGGG GAATTGTCAT GGACCATTCA 1320

GACAATCCCC TTGGTOCAflC COTOAOCCTG GCACAT6AGC TGGGCCACAA TTTCGGGATG 1380 

AATCATGACA CACTGGACAG GGGCTGTAGC TGTCAAATGG GGGTTGA6AA AGGAGGCTGC 1440 

ATCATGAACQ CTTCCACCGG GTACCCATTT CCCATGGTGT TCAGCAGTTG CAGCAGQAAG 1500 

6ACTTGQAGA CCAGCCTGGA GAAAG6AAT6 GGGGTGTGCC T6TTTAACCT GOCQGAAQTC 1560 

AGGGAOTCTT TCQGGGGCCA GAAOTOTOGG AACAGATTTG TGGAAGAAGG AQAGGAGTGT 1620

GACnSTGGGG AGCCAQAGGA ATGTATGAAT 0GCT8CT6CA ATOCCACCAC CTGTACCCTG 1680 

AAGCCGGACG CTGTGTGCGC ACATGGGCTG TGCTGTGAA6 ACTGCCAGCT GAAGCCTGCA 1740 

GGAACAGOGT GCAG6GACTC CAGCAACTCX: TQTGACCTCC CAGAGTTCTG CACAGGGGCC 1800 

AGCCCTCACT GCCCAGCCAA OGTGTACCTG CACGATGGGC ACTCATGTCA GGATGTGGAC 1860 

6GCTACTCCT ACAATGQC3W CTOCCAGACT CACQAGCAGC AGTGTGTCAC ACTCTGGGGA 1920

CCAGGTGCTA AACCT6CCCC TOGGATCTGC TTTGAGAGAG TCAATTCTGC AGGTGATCCT 1980 

TATGGCAACT GTG6CAAAGT CTCGAAGAGT TCCTTTGCCA AATGCGAGAT GA6AGATGCT 2040 

AAATGTGGAA AAATCCAGTG TCAAGGAGGT GCCAGCCGGC CAGTCATTGG TAOCAATGCC 2100 

GTTTCCATAQ AAACAAACAT CCCCCTGCAG CAAGGAGGCC GGATTCTGTG CCGGG6GACC 2160 

CACOTOTACT TOGGCGATOA CATGCOGGAC CCAGGGCTTG TGCTTGCAGG CACAAAGTGT 2220

GCAQATGQAA AAATCTQCCT GAATCGTCAA TGTCAAAATA TTAGTGTCTT TGGGGTTCAC 2280 

GAGTGTGCAA TGOWSTGCCA CGGCAGAGGG GTGTGCAACA ACAGGAAOAA CTGCCACTGC 2340 

GAGGCCCACT GGGCACCTCC CTTCTGTGAC AAGTTTGGCT TTGGAQGAAG CftCAGACAGC 2400 

GGCCCCATCC GGCAAGCAGA TAACCAAGGT TTAACCATAO GAATTCTGGT GACCATCCTG 2460 

TGTCTTCTTG CTGCCGGATT TGTGGTTTAT CTCAAAAGGA AGACCTTGAT ACGACTGCTG 2520

TTTACAAATA AGAAGACCAC CATTGAAAAA CTAAGGTGTG TGCGCCCTTC CCGGCCACCC 2580 

CGTGGCTTCC AACCCTGTCA GOCTCACCTC GGCCACCTTG GAAAAGGCCT GATGAGGAA6 2640 

CCX3CCAGATT CCTACCCACC GAAGGACAAT CCCAOGAGAT TGCT6CAGT6 TCAGAATGTT 2700 

GACATCAGCA GACCCCTCAA CGGCCTGAAT GTCCCTCAGC CCCA6TCAAC TCRGCGAGTG 2760 

CTTCCTCCCC TCCACCQGGC CCCACGTGCA CCTAGCGTCC CTGCCAGACC CCTGCCAGCC 2820

AAGCCTGCAC TTAGGCAQGC CCAGGGGACC TGTAAGCCAA ACCCCCCTCA GAAGCCTCTG 2880 

CCTGCAGATC CTCTGGCC3W3 AAC3ACTCGG CTCACTCATG CCTTGGCCAG GACCCCAGGA 2940 

CAATGG6AGA CTGGGCTCOG CCTGGCACCC CTCAGACCT6 CT<XACAATA TCCA CACCAA 3000 

GTGCCCA6AT CCACCCACAC CGCCTATATT AA6T6AGAAG CCQACAC CTT TTTTCAACAG 3060 

TGAAGACAGA AGTTTGCACT ATCTTTCAGC TCCAGTTOGA GTTTTTTGTA CCAACTTTTA 3120

GGATTTTTTT TAATGTTTAA AACATCATTA CTATAAGAAC TTTGAGCTAC TGCCGTCAGT 3180 

GCTGTGCTGT GCTATGGTGC TCTGTCTACT TGCACAGGTA CTTGTAAATT ATTA ATTTAT 3240 

GCAGAATGTT GATtACAGTG CAGTGOOCTS TAGXAGGCaT TTTTACCATC ACIGAGTTTT 3300 

CCATGGCAGG AAGGCTTGTT GTGCTTTTAQ TATTTTAGTG AACTTGAAAT ATCCTGCTTG 3360 

ATGGQATTCT GGACAGGATO TGTTTGCTTT CTGATCAAG6 CCTTATTGGA AAGCAGTCXX; 3420

CCAACTACCC CCAGCTGTGC TTATGGTACC AGATGCAGCT CAAGAGATCC CAAGTAGAAT 3480 

CrfCAGTTGAT TTTCTGQATT CCCCATCTCA GGCCAGAGCC AAGGGGCTTC AGGTCCAGQC 3540 

TQTGTTTQOC TTTCAGGGAG GCCCTGTGCC CCTEGACAAC TGGCA6QCAG GCTCCCAGGG 3600 

AGACCTCGOA GAAATCTGGC TTCTGGCCAG GAAGCTTTGO T6AGAACCTG GGTTGCAGAC 3660 

AGGAATCTTA AGGTGTA6CC ACACCAGGAT AQAGACTGGA ACACTAGACA AGCCAGAACT 3720

TGACCCTGAG CTGACCAGCC GTGAGCATGT TTGGAAGGGG TCT6TAGTGT CACTCAAGGC 3780 

GGTGCTTGAT AGAAATGCCA AGCACTTCTT TTTCTOGCTG TCCTTTCTAQ AGCACTGCCA 3840 
CaWSTAGGTT ATTTAGCTTG GOAAAGGTGG TGTTTCTGTA AGAAACX^-AC TGCCCA6GCA 3900

CTGCAAACCXS CCMXTOCCT ATACTGCTTG GAGCTGABGA AATCACCACA AACTCTAATA 3960 

CAATGATCCT GTATTCAQAC AGATORGGAC TTTCCATGGG ACCACAACTA TTTTCAGATQ 4020 

TGAACCATTA ACCAQATCTA 6TCAATCAAG TCTGTTTACT GCAAGGTTCA ACTTATTAAC 4080 

AATTAGGCAG ACTCTTTATG CTTGCAAAAA CTACAACCAA TGGAATOTGA TGTTCATGGG 4140 

TATAGTTCAT OTCTGCTATC ATTATTCGTA GATATTG6AC AAAGAACCTT CTCTATGGGG 4200 

CATOCTCTTT TTCCAACTTO GCXGCAGGAA TCTTTAAAA6 ATGCTTTTAA CAGAGTCTGA 4260 

ACCTATTTCT TAAACACTTQ CAACCTACCT GTTOAaCATC ACRGAATGTG ATAAGGAAAT 4320 

CAACTTGCTT ATCAACTTCC TAAATATTAT GAGATGTGGC TTGGGCAGCA TCCCCTTGAA 4380 

CTCTTCACTC TTCAAATGCC TGACTAGGOA GCCATGTTTC ACAAGGTCTT TAAAGTOACT 4440 

AATGGCATGA GAAATACAAA AATACTCAGA TAAGCTAAAA TGCCATGATG CCTCTGTCTT 4500 

CTQGACTGGT TTTCRCATTA GAAGACAATT GACAACAGTT AC ATAA TTCA CTCTQAGTGT 4560 

TTTATQAGAA AGOCTTCTTT TGGGGTCAAC AGtTTTCCTA T6CTTTQAAA CAQAAAAATA 4620 

TCTACCAA6A ATCTTGGTTT GCCTTCCAGA AAACAAAACT GCATTTCACT TTCXTCGGTOT 4680 

TCCCCACTGT ATCTAG6CAA CATAGTATTC AT6ACTATGG ATAAACTAAA CACX3TGACRC 4740 

AAACACACAC AAAAGQGAAC CCAGCTCTAA TACATTCCAA CTCGTATAGC ATGCATCT6T 4800 

TTATTCTATA GTTATTAAOI TCTTTAAAAT GTAAAGCCyVT GCTGGAAAAT AATACTGCTG 4860 

AOATACATAC AOAATTACTG TAACTQATTA CACTTGGTAA TTGTACTAAA GCCAAACATA 4920 

TATATACTAT TAAAAAGGTT TACAGAATTT TATGGTGCAT TAOSTGGGCA TrGTCrrTTT 4980 

AOATGCCCAA ATCCTTAGAT CTGGCATOTT A6CCCTTCCT CCAATTATAA OAGGATATGA 5040 
ACCftAAAAAA AAAAAAAAAA AA 



Seq ID NO: 427 Protein sequence
Protein Accession #: NP_003465

1 11 21 31 41 51

I I I I I I

MftARPLPVSP ARALLLALAG ALliAPCEARG VSLWUBGRAD BWSASVRSG DLWIPVKSFD 60 

SKNHPBVLMI RLQRESKELI INLERNEGLI ASSFTETHyL QDGTDVSIiAR NYTVIIiGHCy 120 

YHGHVRGYSD SAVSLSTCSG LRGLIVFENE SYVLBPHRSA TNBXXLFFAK KLKSVROSOS 180 

SHHNTPNLAA KNVPPPPSQT WARRHKRETL KATK5fVBLVI VADNRBFQRQ GXDLBKVKQR 240 

LIEIANHVDK FYRPUIIRIV LVGVEVWNDM DKCSVSQDPP TSLHEFLDWR KMKLLPRKSH 300 

DNAQLVSGVY FQGTTIOflAP IMSMCTADQS GGIVMDHSDN PLGAAVTLAH BLGHNPfflOJH 360 

DTLDRGCSCQ MAVEKGGCIM NASTOYPPPM VPSSCSRKDL BTSLEKGMGV OjEHLPEVRE 420 

SPGGQKCGNR FVEBGEECDC GEPEECMHRC CNATTCTLRF DAVCAHCKiCC EDGQUCPAGT 480 

ACRDSSNSCD LPEFCTGASP HCPAMVyZiRD GHSOODVDGY CyNGICQTHE QQCVTI.WGPG 540 

AKPAPGICFE RVNSAGDPYG MOGKVSKSSF AKC31RDAKC GKIQCQGOAS RPVIGTNAVS 600 

IBTNIPLQQG GRILCR6THV YLGDDMPDPG LVLAGTKCAD GKICLNRQCQ NISVFGVHEC 660 

AMQCHORQVC MNRKNCHCEA HWAPPFCDKF GFGGSTDSGP IRQADNQGLT IGILVTILCL 720 

LAAfiFWVLK RKTLIRIJ.FT HKKTTIBKIiR CWRPSRPERG FQPOQAHLGH LiOTGLMRKPP 780 

DSYFPKDNPR RLLQOQNVDX SRFLNGUIVP QPQSTQRVLP PIiKRAPRAPS VPARPIiPAKP 840 

ALROAQGTCaC PNPPQKPLPA DPIiARTTHLT HALARTPGQW EIGLRLAPLR PAPQYPHQVP 900 
RSTBTAyiR 



Seq ID NO: 428 DNA sequence
Nucleic Acid Accession #: NM_003714
Coding sequence: 135..1043

1 11 21 31 41 51 

GAGGA6GAG6 GAAAAGGGOA GCAAAAAGGA AGA6TGGGAG GAG6AG6GGA A6CGGC6AAG 60 

GA6GAAGAGG AGGAGGAGGA AQAGGGQAGC ACAAAGGATC CAGGTCTCCC G ACGG GAGGT 120 

TAATACCAAG AACCATGTGT GCCGAGCGGC TGGGCCAGTT CATGACCX^G GCTTTGGTGT 180 

TGGCCACCTT TGACCC36GC6 CGGOOGACCG ACGCCACCAA CCCACCCGAG GGTCCCCAAG 240 

ACAGGAGCTC CCaWSCAGAAA GGCCGCCTGT CXXTrOCAOAA TA CAGC GGAG ATCCAGCACT 300 

GTTTGGTCAA OSCTGGOQAT GTGGGQTGTG GOGTGTTTGA ATGTTTCGAG AACAACTCTT 360 

GTGAGATTC30 GG6CTTACAT GGGATTTGCA TQACTTTTCT GCACAACGCT GGAAAATTTG 420 

ATGCCCAGGG CAAGTCATTC ATCAAAGACG CCTTQAAATG TAAGGCCCAC GCTCTGCGGC 480 

ACAGGTTOGG CTGCATAAGC CGGAAGTGCC OGGCXZATCAG GGAAATGGTG TCCXAGTTGC 540 

ASGGGGAATC CTAOCTCAAO CAC5SACCTOT OC3Gm3CTGC OCAGGAGAAC ACCCX3GQTGA 600 

TAGTGGAGAT GATCCATTTC AAGGACTTQC TOCTOCAOGA ACCCTACGTG GACCTCGTGA 660 

ACTTGCTGCT GACCTGTGQO GAGGAGGTGA AGGAGGCCAT CACCCACAGC GTGCAGGTTC 720 

AGItSTGAGCA GAACTGGGGA AGCCTGTGCT CCATCTTGAG CTTCTGCACC TCG6CCATCC 780 

AGAAGCCTCC CACGGCOCCC CCCGAGCGCC AGCCCCAGGT GGACAQAACC AAGCTCTCCA 840 

GGGCCCAOCA GGGG6AAGCA GGACATCACC TCCC3VGAGCC CAGCAGTAQG GAGACTGGCC 900 

GAGGTGOCAA GGGTSftGCXSA 6GTAGCAA6A GCCACCCAAA CGCCCATGCC CGAGGCAGAG 960 

TCGGGGGCCT TGGGGCTCAG GGACCTTCC3G GAAGCAGCGA GTGGGAAGAC GAACAGTCTG 1020 

AGTATTCTGA TATCCGGAGG TGAAATGAAA QGCCTGGCCA OGAAATCTTT CCTCCAOQCC 1080 

6TCCATTTTC TTATCTATGG ACATTCCAAA ACATTTACCA TTAGAGAGGG GGGATGTCAC 1140 

ACGGAQQATT CTGTGGGGAC TGTGGACTTC ATCGAGGTOT GTGTTCGOGG AACX3GACAGG 1200 

TOAGATGCAO ACCCCTGGGG CCGTGGGGTC TCAQGQGTGC CTGGTG AATT CTGCACTTAC 1260 

AC3GTACTCRA GGGAGCGCQC CCGCGTTATC CTCGTACCTT TGTCTTCTTT CCATCTGTGQ 1320 

AGTCAGTGGG TGTCX3GCCGC TCTGTTGTGG GGGAGGTGAA CCAOGaAGGQ QC aGGGCAAO 1380 

GCAGGGCCCC CAGAGCTGGG CCAC3VCAGTG GGTGCTGGGC CTCGCCCC3GA AGCTTCTGQT 1440 

GCftGCAOCCT CTGGTGCTGT CTCGGOSGAA GTCAGGGCX3G CTGQATTCCA GQACAGQAGT 1500 

GAATCTAAAA ATAAATATCQ CTTAGAATGC AGGAGAAGGG TGGAGAGGAG GCAGGGGCCG 1560 

AGGGGGTGCT TGGTGCCAAA CTGAAATTCA GTTTCTT6TG TGfiGGCCTTQ 0GGTTCAGA6 1620 

CTCTTGGCQA GGGTGQAGGG AGGAGTGTCA TTTC3TMGTG TAATT TCTG A G CCATTG TAC 1680 

TGTCTGGGCT GGGGGGGACA CTGTCCAAGG GAGTGGCCCC TATGAGTTTA TATTTTAACC 1740 

ACTGCTTCAA ATCTCGATTT CACTTTrrTT ATTTATCCAG TTATATCTAC ATATCTGTCA 1800

TCTAAATAAA TGGCTTTCAA ACAAAGCAAC TGGGTCATTA AAACC3U3CTC AAAGQGGGTT 1860

TAAAAAAAAA AAAACCAGCC CATCCTTTGA GGCTGATTTT TCTTTTTTrT AAQTTCTATT 1920 

TTAAAAGCTA TCAAACAGOG ACATAGOCAT ACATCTQACT GCCTQACATG GACTC CTGCC 1980 

CACTTGGGGG AAACCTTATA CCCAOAGGAA AATACACACC TG6GGAGTAC ATTTOACSUVA 2040 

TTTCCCTTAG GATTTCOTTA TCTCACCTTG ACCCTCAGCC AAQATTGGTA AAGCTGCGTC 2100 

CTGGCGATCC CAGGAGACCC AGCTGGAAAC CTGGCTTCTC CATGTGAQGG GATGGGAAAG 2160 

GAAAGAAGA6 AATGAAGACT ACTTAGTAAT TCCCATCAGG AAATGCTGAC CTTTTACATA 2220 
AAATCAA66A GACT6CTGAA AATCTCTAAG GGACA6GATT TTCCA6ATCC TAATT66AAA 2280

TTTA6CAATA AQ8AC3AGQAG TGCAAGGGGA CAAATAAAGG CAGAGAQAGA QACAGAGAGA 2340

GGGA6AG6AA GAAAAGAGAG AGAGAAAAGA GCCTC6TGCC

Seq ID NO: 429 Protein sequence
Protein Accession #: NP_003705

1 11 21 31 41 51

I I I I I I

MCAERLGQPM TIALVLATPD PARGTDATNP PEGPQDRSSQ QKGRLSLQNT AEIQHCLVNA 
GDVGCGVFEC FENNSCEIRG LHGICMTPLH NAGKFDAQ6K SFIKDALKCK AHALRHRFGC 
ISRKCPAIRS MVSQIiQRECY LKHDLCAAAQ ENTRVIVEMI RFKDLLLHEP YVDLVNLIiLT 
QGEEVKEAIT HSVQVQCEQN WGSLCSILSP CTSAIQKPPT APPERQPQVD RTKLSRAHHG 
BAGHHLPEPS SRETGRGAKG SRGSKSHPMA KARGRVG6LG AQGPS6S8EN EDEQSEYSDI 
RR 

Seq ID NO: 430 DNA sequence
Nucleic Acid Accession #: NM_005940
Coding sequence: 23..1489



1 
I 

AAGCCCAGCA 
CGCCCTCCTG 
TCTGOCGCOG 
A6CCCT6CCC 
CAGCCTCAGG 
COGACAGAAG 
GATCCTTGGG 
CCTAAAGGTA 
TGACATCATG 
TGGGGGCATC 
CGACTATGAT 
AGCCCATGAA 
GTCC9GCCTTC 
TCAACACCTA 
CCAGGCTG6G 
CTGTGAGGCC 
. GGGCTTTOTG 
T06CCACTGG 
CATTTGGTTC 
CCCCGCACCC 
GGGTCCCGAG 
CAGCACCCGG 
CTCTGAGATC 
CCTCTACTGG 
GGGTCCT6AC 
ATGCCCTCAG 
ATCTTTCTGG 
GGTGQGGTAC 
AQGGACTGTC 
GGGACC06CT 
GTAGCACCAT 
TCCTTCCAGG 
TGAGCAACTG 
ATCTGTCTGC 
GTTCACAGTC 
CAACATACCT 
ATCCTCCAAA 
TTTTTAAACr 



11 

I 

GCCCCGGGGC 
CCCCOGATGC 
GAOGTCCACC 
AGTAGCCCGG 

AGGTTCGTGC 
TTOCCATGGC 
TGGAGCGATG 
AT0GACTT06 
CTGGCCCATG 
GA6ACCTGGA 
TTTGGCCACG 
TACACCTTTC 
TATG6CCA6C 
ATAGACACCA 
TCCTTTGAOG 
TGGOSCCTCC 
CAOGSACTGC 
TTCQ\AG6TG. 
CTCACXXSAGC 
AAGAACAAGA 
CGTGTAGACA 
GAGQCTGCCT 
AAGTTTGACC 
TTCTTTGGCT 
GGGTGCTGAC 
CTQTGGGCAC 
AA0CACCAT6 
TC3USACTGG6 
ATGCAGGTCC 
GGCAGGACTG 
GGCTGGCACT 
GGCTGTAGGG 
CTTCTGGCTG 
AAATGGGGAG 
CAATCXrrGTC 
GCCATTGTAA 
GAGGATTGTC 




CTATOSGGGA 
TGCTGGGGCT 
GCTACCCACT 
CCT6QCX3CAC 
ATGAGATTGC 
CGGTCTCCAC 
GTGGQGGCCA 
CCAOCGCXGT 
CTCAGTACTCj 
TGGGCCTGGT 
TCTACTTCTT 
GTCCCGTGCC 
TCCAG GATGC 
CT6TQAAG0T 
GTGCCGAGCC 
CCCTGCCAGG 
CAGGCATGGG 
ACAACIGCXX3 
CAGGQAGGCT 
TGGCAAACCT 
GGGGAACTGG 
GAAGCAAGGG 
CAGGGCCACT 
ACAATCCTGG 
GGGTATTCTT 
CCAGGCCGGA 
ATGTGTGTAC 
ATTAAACACA 



31 

I 

GGCCGCCTGG 
GCTCCAGCCG 
CGAGAGGAGG 
TGCCAQ3CA6 
CX3ACXXATCT 
GCGCTGGGAG 
GGAGCAGGTG 
CACCTTTACT 
GCATGGGGAC 
CAA6ACTCAC 
TGACCAGGGC 
GCAGCACACA 
GAGTCrcaGC 
TGTCACCTCC 
ACCGCTGGAG 
CATCCGAGGC 
GCTGCAGCCC 
GGAOGCTQCC 
GGTGTACQAC 
GAGGTTCCCX5 
CCGAGGCAGG 
CCGCAGGGCC 
TGATGGCTAT 
GAAGGCTCTG 
TGCXAACACT 
CCACGAATAT 
ACTGAGCCCA 
GGAG6GCCAC 
TTGOCATGAC 

AGTGTCCTTG 
TGCTGGGGCC 
TCCTGAGGTC 
AAATCTGTTC 
CATGCAGGAG 
TCCTCCTGAA 
A6TGTGTATA 
GTTOTTTTCT 



41 

1 

CTCCGCAGCG 
CCGCCGCTGC 
GG6GCACAGC 
GAAGCCCCXX: 
GATGGGCTGA 
AAGACGGACC 
GGGCAGACX3A 
GAGGTGCAG6 
OACCTGCOGT 
CGAGAAGGG6 
ACAGACCTGC 
ACAGCAGCCA 
CCAOATGACT 
AGGACCCCAG 
CCAGACGCCC 
GAGCTCTTTT 
GGCTACCCAG 
TTGGAGGAT6 
QOTGAAAAGC 
GTCCATGCT6 
GACTACTGGC 
ACTGA CTGGA 
GOCTACTTOC 
QAAGGCTTOC 
TTCCTCTGAC 
CAGGCTAGAG 
TGTCTCCTGC 
GCAG8T0GTG 
TTAAGAGGAA 
TCTCATCCCT 
CTGTATCCCT 
CCATG6CCTT 
AGGTCTTGGT 
TCCAGAATCC 
ACCCCAGGGC 
GCCCTTTTCG 
AACCTTCTTC 



SI 

I • 

CGGCCX3CG0G 

TGGCC06GGC 

CCTGGCAT6C 

GGCCTGCCAG 

GTGCCOGCAA 

tCACCTACAG 

TGGCAGAGGC 

AGGGC06TGC 

TTGATGGGCC 

ATGTCCACTT 

TGCAGGTGGC 

AGGCCCTX3AT 

GCAGGG6GQT 

CCCTGGGCCC 

CGCCAGATGC 

TCTTCAAAGC 

CATTGGCCTC 

CCCAGGGCCA 

CAOTCCTGGG 

CCTTGGTCTG 

GTTTCCACCC 

GAGGGGTGCC 

TQOGCGGGGQ 

CCOGTCTOQT 

CATGGCTTGG 

ACCCATGGCC 

AGGGGGATGG 

GTCACCTGCC 

6GGCAGTCTT 

GTCCCTCAGG 

GTTGTGAGGT 

CAGCCCTGGC 

AGGTGCCTGC 

AGGCCAAAAA 

CTGGAGGCTG 

CAGCACT6CT 

TTCTTTTTTT 



60
120
180
240
300

60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220

Seq ID NO: 431 Protein sequence
Protein Accession #: NP_005931

1 11 21 31 41 51

I I I I I I

MAPAAWLRSA AARALLPPML LIiLLQPPPIiL ARALPPDVRH LHAERRGPQP WHAALPSSPA 60

PAPATQEAPR PASSLRPPRC 6VPDPSDGLS ARNRQKSFVL SG(3LWEKTDL TYRILRFPflQ 120

LVQBQVRQTM AEAIiKVWSDV TPLTFTEVKE (aiADIMIDFA RYWHGDDLPF DGPGGIIABA 180

PFPKTHREGD VHPDYDETWT IGDDQGTDLL QVAAHEFGRV LGLQHTTAAK ALMSAPYTFR 240

YPLSLSPDDC RGVQHLYGQP WPTVTSRTPA LGPQAGIDTN EIAPLEPDAP PDACEASPDA 300

VSTIRGELFF FKAGFVWRIiR GGQLQPGYPA LASRHWQGLP SPVDAAFEDA QGHIHFFQGA 360

QYWVYDGEKP VLOPAPLTEL GLVRPPVHAA LVffGPEKNKZ YFFRGRDinni EHPSTRRVDS 420

PVPRRATDWR 6VPSBIDAAP QDADGYAyPL RGRLYNXFDP VXVKALEGFP RLVGPDPF6C 480
ABPANTFL

Seq ID NO: 432 DNA sequence
Nucleic Acid Accession #: NM_024022
Coding sequence: 202..1563

1 11 21 31 41 51

I I I I I I

ACC3QGGCA0C GQAGQGCTGG GGTACTTTOG TTCTTAATTA GGTCATGOCC GTGIGAGCCA 60

GGAAAGGGCr GTGTTTATGG QAAGCCAGTA ACACTGTGGC CTACTATCTC TTC0GTGGT6 120

CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 180

AGAGGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CTGCTGTTGA AGCCCCCTTC 240

TCATTCCGAT C3GCTTTTTGG CCTTGATGAT TTGAAAATAA GTCCTGTTGC ACCAGAT6CA 300

GATGCTGTTG CTGCACAGRT CCTOTCACTO CTGCCATTOA AGTTTTTTCC AATCATOQTC 360

ATTGGGATCA TTGCATTGAT ATTAGCACTG OCCATTGOTC TGGQCATOCA CTTCX3ACTGC 420 

TCAQGGAAGT ACAGATGTCG CTCATCCTIT AAOTOTATOG AGCTGATAGC TCGATGTGAC 480 

GGAGTCTCGQ ATTGCAAAOA GaGGGAGOAC GA0TACG6CT GTGTCCG6GT GGGT66TCAG 540 

AAT6CX3QTQC TCCAOGTGTT OlCAGCTGCr TGGTGGAAGA CCATGT6CTC CGATGACTGO 600 

AAGGGTCACT A06CAAATGT TGCCTGTGCC CAACTGG6TT T0CCAA6CTA TGTCSAGTTCA 660 

GATAACCTCA GAQTGAGCTC GCTGGAGGGG CAGTrCOGQG AGGAGTTTGT QTCCATCGAT 720 

CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780 

TGTGCCTCK3 GCXZACGTGGT TACCTTGCAG TGCACAGCCT OTGGTCATAG AAGGG6CTAC 840

AOCTCA06CA TCGTGGGTGG AAACATGTCC TTGCTCTGQC AOTGOCCCIQ 6CAGG0CAGC 900

CTTCAGTTCC AGQGCTACCA CCTOTGOGOO GGCTCTOTCA TCROGCCCCT GTGGATCATC 960 

ACTGCTGCAC ACTGTGTTTA TQACTTGTAC CTCCCXAAGT CATGGACCAT CCAGGTGGGT 1020 

CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080 

A6CAA6TACA AOCCAAAGAO GCTGGGCAAT GACATG6CCC TTATGAAGCT GGCGGGGCCA 1140 

CTCAOGTTCA ATOAAATOAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200 

GATGGAAAAO TOTGCTGGAC GTCAGGAT06 6GGGCCACA6 AG6ATGGAGG TGACGCCTCC 1260 

CCTGTCCTGA ACXACGOGGC CGTCCCTTTG ATTTCX»ACA AGATCTGCAA CCACAGGGAC 1320 

GTGTAOGQTQ GCATCATCTC CCCCTCCATG CTCTGOGCGG GCTACCTGAC GGGTGGCQTG 1380 

6ACAGCT6CC A66GGGACAG CX3GG6GGCCC CTG6TGT6TC AAGAGAGOAO GCTGTGGAAG 1440 

TTA0TQ6GAG 08ACCAQCTT TGGCATGGGC T606CA6AGG TGAACAAGCC TG6GGTGTAC 1500 

ACCCGTGTCA CCTCCTTCCT GGACTGGATC CA06AGCAQA TGGAOAQAGA CCTAAAAACC 1560 

TGAAGAGGAA GGGGACAAGT AGCCACCTGA GTTCCTQAGG TGATGAAGAC AGCCCGATCC 1620 

TCCOCTGGAC TCCC3GTGTAG QAACCTGCAC ACGAGCAQAC ACCCTTGGAG CTCTGAGTTC 1680 

OGGCACCAGT AGCAG6CCCG AAAGAGGCAC CCTTCCATCT GATTCCAGCA CAACCTTCAA 1740 

GCT6CTTTTT tffrmwri ' TTTTTGAGOT GGAGTCTCGC TCTGTTGCXX: AGGCTGGAGT 1800

GCAGTGGCGA AATCCCTGCT CSWrTGCAGCX: TCCGCTTCCC TGGTTCAAGC GATT CTCTT G 1860 

CCTCAGCTTC CCCAGTAOCT GQGACCACAG GTGCCCGCCA CCACACXTCAA CTAATTTTTG 1920 

TATTTTTAOT AQAOACAGGG TTTCACCATG TTGGCCRGGC TGCTCTCAAA CCCCTGACCT 1980 

CAAATGATGT QCCTGCTTCA GCCTCCCACA GTGCTGG6AT TACAGGCATG GGCCACCAC3G 2040 

OCTAGGCTCA OOCTCCTTTC TGATCTTCAC TAAGAACAAA AGAAGCAGCA ACTTGCAAGG 2100 

60Q6CCTTTC C3CACTGGTCC ATCTGGTTTT CTCTCCA6QQ GTCTTGCAAA ATTCCTGAC3G 2160 

AOATAAGCAG TTATGTGACC TCAOGTGCAA AGCCACCAAC AGCXIACTCAG AAAAOAGGCA 2220 

CCAGCCCAGA AGTGCAGAAC TGCAGTCACT OCACQTTTTC ATCTCTAGGG ACCAGAACCA 2280 

AACCCACCCT TTCTACTTCC AAGACTTATT TTCACATGTG GGOAGGTTAA TCTAGGAAIG 2340 

ACTOGTTTAA GGCCTATTTT CATGATTTCT TTOTAQCATT TGGTGCTIGA CGTATTATTG 2400 

TCCmOATT CCAAATAATA TGTTTCCTTC CCTCAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 



Seq ID NO: 433 Protein sequence
Protein Accession #: NP_076927

1 11 21 31 41 51

I I I I I I

MGENSPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQX LSLLPUCFFP IIVIGIIALI 60 

LAIAIGLGIH FDCSGKYRC31 SSPKCIELIA RCDGVSDCKD GEDEYRCVRV GGQNAVLQVP 120 

TAASWKTMCS DDWKQHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180 

VTAIiBHSVYV RBGCASGHW TLQCTACGHR RGYSSRJVGG NMSLIiSQWPW QASLQFQOYH 240 

XiCGGSVITPL HZZXAAHCVy OLYLPKSNTI QVGLVSIiLON FAPSBLVEKI VYHSKYKPKR 300 

USKDIALMKL AGPLTFNEHI QFVdi^NSBB HFPD6KVCWT SGHGATBDOO DASFVUIHAA 360 

VPLXSNKXCN BRDVYGGIZS PSMLCAGYLT GGVDS0QGD8 OGPLVCQSRH IiMKLVGATSF 420 
GIGCAEVNKP GVYTRVTSPL DWIHBQMBRD LKT 

Seq ID NO: 434 DNA sequence
Nucleic Acid Accession #: NM_000493.2
Coding sequence: 97..2139

1 11 21 31 41 51

I I I I I I

CACCTTCTGC ACTGCTCATC TGG6CAGAGG AAGCTTCAGA AAGCTGCCAA GOCAC CATC T 60 

CCAGGAACTC CCAGCACGCA GAATCCATCT GAGAATATGC TGCCACAAAT ACCCTTTTTG 120 

CTGCTAQTAT CCTTGAACTT GGTTCATGGA GTGTTTTACG CTGAAOGATA CCAAATGCCC 180 

ACAGGCATAA AAGGCCCACT AGCCAACACC AAGACACAQT TCTTCATTCC CTACACCATA 240

AAGAOTAAAG GTATAGCAGT AAGA6GAGAQ CAAGOTACTC CT6QTCCACC AGGOCCTGCT 300 

GGACCTOSAG GGCACX:CAGG TOCTTCTGGA CCACCAGGAA AA0CA6GCTA CGGAAGTCCT 360 

GGACTCCAAG GAGAGCCAGQ GTTGCCAGGA CCACOGGGAC CATCAGCTGT AGGGAAACCA 420 

GGTGTGOCAG GACTCCCAGG AAAACCAGGA GAGAGA6GAC CATATG6ACC AAAAGGAGAT 480 

GTTGQACCAO CTGQCCTACC AOGACCCCGG GGCCCACCAG GACCACCTGG AATCCCTGGA 540 

006QCTGGAA TTTCTGTGCC AGGAAAACCT GOACAACAGO GACCCACAGG AGOCCCAGGA 600 

CCCAG6QQCT TTOCTGQAQA AAAGGGTGCA CCAGGAGTOC CTGGTATGAA TGGACAGAAA 660 

GQGGAAATGG GATATGGTGC TCCTGGT06T CCAGGTGA6A GGGGTCTTCC AGGCCCTCAG 720 

GGTCCCACAG GACCATCTGG CCCTCCTGGA GTGGGAAAAA QAGQTGAAAA TGGGGTTCCA 780 

GGACAGCCAG GCATCAAAGG TGATAGAGGT TTTCCOGGAG AAATGGGACC AATTGGCCCA 840 

CCAGSrOCCC AAGGOCCTCC TGGGGAAOSA GQ6CCAGAAG GCATTGGAAA GCCAGGAGCT 900 

6GTGGA6CCC CAGOCXAGCC AGGGATTCCA GOAACAAAAG GTCTCCCTGG GGCTCCAGGA 960 

ATAGCTGGGC CCOCAGQGCC TCCTGGCTTT GGGAAACCAG GCTTGCCAGG CCTGAAGGGA 1020

GAAAGAGGAC CTQCTGGCXrP TCCTGGGQGT CCAGGTGCCA AAGGGGAACA AGGGCCSUSCA 1080 

GQTCTTCCTQ GQAAQCCAGO TCTGACTG6A CCCXXTTCGGA ATATQGGACC CCAA6QACCA 1140 

AAAGGCATCC 0GGGTA6CCA TGGTCTCCCA GGCCCTAAAG GTGAGACAGG GCCAGCTGGG 1200 

CCTGCAGGAT ACGCTGGGGC TAAGGGTGAA AGGGGTTCCC CTGQGTCAGA TGOAAAACCA 1260 

GGGTACCCAG GAAAACCAGG TCTCGATGGT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 1320 

AAAG6TGATC CTGGAGTTGG AGGACCTCCT GGTCTCCCAO GGCCIGTGG6 CCCAGCAGGA 1380 

GCAAAGGGAA TGCCGGGACA CAATG6AGAG GCTGGCCXAA GAGGTGCCCC TGOAATACCA 1440 

GGTACTAOAG GCCCTATTGG GCCACCAGGC ATTCCAGQAT TCCCTGGGTC TAAAGGGGAT 1500 

CCAGOAAGTC CCGGTCCTCC TGGCCCAGCT GGCATAGCAA CTAAGGGCCT CAATGGACCC 1560 

ACCGGGCCAC CAGGGCCTCC AGGTCCAAGA G6CC31CTCTG GAGAGC CTGG TCTTCCAGGG 1620 

CCCCCTG6GC CTCCAGGCCC ACCAGGTCAA OCAOTCATGC CTGAGGGTTT TATAAAGGCA 1680 

GGCCAAAGGC GCAGTCTTTC TGGGACOCCT CTTGTTAGTG CCAACCAG60 GGTAAGAGGA 1740 



AT6CCT6TGT CTGCTTTTAC TGTTATTCTC TCCAAAGCTT ACCCAGCAAT AGGAACTCCC 1800

ATACCATTTG ATAAAATTTT GTATAACAGG CAACA6CATT AT6ACCCAA6 GACTGGAATC 1860

TTTACTTGTC AGATACCAGG AATATACTAT TTTTCATACC ACOTGCATGT GAAAGGGACT 1920

CATGTTTGGG TAGGCCTGTA TAAGAATGGC ACCCCTGTAA TGTACACCTA TGATGAATAC 1980

ACCAAA6GCT ACCTGGATCA GGCTTCAGGG AGTGCCATCA TCGATCTCAC AGAAAATGAC 2040

CAGGTGTGGC TCCAGCTTCC CAATGCCXIAG TCAAATGGCC TATACTCCTC TGAGTATQTC 2100

CACTCCTCTT TCTCAGGATT CCTAGTGGCT CCAATGTGAG TACACCCCAC AGAGCTAATC 2160

TAAATCTTGT GCTAGAAAAA GCATTCTCTA ACTCTACKCC ACCCTACAAA ATGCATATGG 2220

AGGTAGGCTG AAAAGAATGT AATTTTTATT TTCTGAAATA CAGATTTGAG CTATCAQACC 2280

AACA7VACCTT CCCCCTGAAA AGTGAGCAGC AACGTAAAAA CGTATGT6AA GCCTCTCTT6 2340

AATTTCTAOT TAGCAATCTT AAQGCTCTTT AAGGTTTTCT CCAATATTAA AAAATATCAC 2400

CAAAGAAGTC CTGCIATGTT AAAAACAAAC AACAAAAAAC AAAGCAACAA AAAAAAAAAT 2460

TAAAAAAAAA AACAGAAATA GAGCTCTAAG TTATGTGAAA TTTGATTTGA GAAACTCGGC 2520

ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC TATGAATATG AGAACTTCTA GGAAACATCC 2580

AGGAOGTATC ATATAACTTT GTAGAACTTA AATACTTGAA TATTCAAATT TAAAAGACAC 2640

TOTATCCCCT AAAATATTTC T6ATGGT6CA CTACTCTGAG GCCTGTATGG CCCXITTTCAT 2700

CAATATCTAT TCAAATATAC ASQTOCATAT ATACTTGTTA AAGCTCTTAT ATAAAAAAGC 2760

CCCAAAATAT TOAAGTPCAT CTGAAATGCA AGGTGCTTTC ATCAATGAAC CTTTTCAAAA 2820

CTTTTCTATG ATTGCA6AGA AGCTTTTTAT ATACCCAGCA TAACTTG6AA ACAGGTATCT 2880

GACCTATTCT TATTTAGTTA ACACAAGTGT GATTAATTTG ATTTCTTTAA TTOCTTATTG 2940

AATCTTATGT GATATGATTT TCTGGATTTA CAGAACATTA GCACATGTAC CTTGTGCCTC 3000

CCATTCAAGT OAAGTTATAA TTTACACTGA GGGTTTCAAA ATTCX3ACTAG AAGTGGAGAT 3060

ATATTATTTA TrTATGCACT GTACTGTATT TTTATATTGC TGTTTAAAAC TTTTAAGCTG 3120

TGCCTCACTT ATTAAAGCAC AAAATGTTTT ACCTACTCCT TATTTACGAC ACAATAAAAT 3180

AACATCAATA GATTTTTAGG CTCAATTAAT TTOAAAGCAG CAATTTGCTO TTCTCAACCA 3240
TTCTTTCAAG GCTTTTCATT CGACACSU^TA AAATAACATC AATAG



Seq ID NO: 435 Protein sequence
Protein Accession #: NP_000484.2



1 

I 

MX^IPFIiliL 
TPGPPGPAGP 
GPYGPKGDVG 
VPGMNGQK6B 
GESIGPIGPPG 
POLPGLKGER 
K8ETGPASPA 
PGPVGPAGAK 
ATKGLNGPTO 
SANQGVTO^P 
YHVHVKGTHV 
GLYSSEYVHS 



11 
1 

VSLNbVHGVF 
RGHPGPSGPP 
PAGLP6PR6P 
M6YGAP6RPG 
FQGPPGER6P 
GPAGLPGGPG 
GYPGAX6ERG 



21 

1 

YAERYQMPTG 
GKPGYGSPGI* 
PGPPGZPGPA 



PPGPP6FRGR 
VSAPTVILSK 
WVGLYKNGTP 
SFSGFLVAPM 



BGIGKPGAAG 
AKGBQ6PAGL 
SP6SDGKPGY 
PROAPQIFGT 
SGEPGIiPGPP 
AYPAIGTPIP 
VMYTYDBYTK 



31 

1- 

IKGPLPNTKT 
QGEPGLPGPP 
GISVPGXPGQ 
TGPSSPPGVG 
APGQPGXPGT 
PGKPGLTGPP 
PGXPGIJ)6PK 
RGFIGPF6IP 
6PFGPFG0AV 
PDKILYHRQQ 
GYLDQASGSA 



41 

1 

QFFIPYTIKS 
GPSAVGKPGV 
QOPTGAPGFR 
KRGEN6VPGQ 
KGLPGAPGIA 
GHMGPQGPKG 
GNPGXiPGPKG 
GFPGSKGDPG 
MPBGFIKAGQ 
HYDPRTGIFT 
IISLTENDQV 



51 
I 

K6XAVRGEQ6 
PGLP^CPGER 



P6IX(3)RGFP 
GPPGPPGFGK 
IPGSEGLP6P 
DPGVGGPPGti 
SPGPPGPAGI 
RPSLSGTPLV 
GQIPGIYYFS 
miQLPNAESN 



60
120
180
240
300
360
420
480
540
600
660

Seq ID NO: 436 DNA sequence
Nucleic Acid Accession #: XM_062811
Coding sequence: 1..888

1 11 21 31 41 51

I I I I I I

ATGTGGGGOG CTOGCCGCTC GTCCGTCTCC TCATCCTGGA ACGCXX3CTTC GCTCCTGCAG 60

CTGCTGCTGG CTGCGCTGCT GGCGGCGGGG GCGAGGGCCA GOGGOQAOTA CTG0CA0Q6C 120

TGGCTG6AG6 C6CAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCXX3AGC6 CTTOGAOGGC 180

GGCOAOGCCA CCATCTGCTG CGGCAGCTGC GCGTTG(X3CT ACTGCTGCTC CAGCGCCGAO 240

GC36GGCCIGG ACCAGGGCGG CTGCGACAAT GACOGCCAGC AG6GCGCTGG CGAGCCTGGC 300

GGGGOGGACA AAGAOGGCCC CGACGGCTCG GCA6T6CCCA TCTAC3GTGCC GTTCCTCATT 360

GTTGGCTCCG TGTTTGTCGC CTTTATCATC TT6660TCCC TGGTOGCAGC CrOTTGCTGC 420

AGATGTCTCC GGCCTAAGCA GGATCCCCAG CAGAGCCGAG CCCCAGGGG6 TAACC6CTT6 480

ATGGA6ACCA TCCCCATGAT CCCCA6TGCC AGCACxrrccc GGGGGTOGTC CTCACGCCAG 540

TCCAGCACAG CTGCCAGTTC CAGCTCCA6C GCCAACTCAG GGGCCX33GGC GCCCXXAACA 600

AGGTCACAGA CCAACTGTT6 CTtGCGGGAA GGGACCATGA ACAAOGTGTA TGTCAACATG 660

CCCACGAATT TCTCTGTGCT GAACT6TCAG CAGGCCACCC AGATTGTGCC ACATCAAGQG 720

CA6TATCTGC ATCCCCCATA CGTGGGGTAC AC8C3TGCAGC AGGACTCTGT GCCCATOACA 780

GCIGTGCCAC CTTTCATQQA GQGCCTGCAQ CCTG8CTACA G6CA0ATTCA OTCCCCCTTC 840
CCTCACACCA ACA6T6AACA GAAGATGTAC CCAGOaOTGA CTGTATAA



Seq ID NO: 437 Protein sequence
Protein Accession #: XP_062811

1 11 21 31 41 51

jlWGARRSSVS SSWKAASLLO LlAAIOAAG ARASGEVCHG WLDAQGVWRI GPQCPBRPDG 60

CTATICCGSC ALRYCCSSAE ARLDQGGCDN DRQQGAGEPG SADKDGPDGS AVPIYVPFLI 120

VGSVFVAFII LGSLVAACCC ROiRPKQDPQ QSRAPGGSRL METIPMIPSA STSRQSSSRQ 180

SSTAASSSSS ANSGARAPPT RSQTNCCLPE GTMNNVYVNM PTNFSVLNCQ QATQIVPHQO 240
QYLHPPYVGY TVQHDSVPMT AVPPP^a3G^iQ PGYRQIQSPP PHTNSBQKMY PAVTV


Seq ID NO: 438 DNA sequence
Nucleic Acid Accession #: NM_004004.1
Coding sequence: 1..681

1 11 21 31 41 51

I I I I I I

ATGGATTGGG 6CACGCTGCA GACGATCXJTG GGGGGTGTGA ACAAACACTC CACCAGCATT 60

GQAAAGATCT GGCTCACCGT CCTCTTCATT TTTCGCATTA TGATCCT06T TGTGGCTGCa 120

AAGGAGGTGT GG6GAGATGA GCAGGCCGAC TTTQTCTOCA ACACCCTGCA GCCAGGCTGC 180

AAGAAOGTOT GCTACGATCA CTACTTCCCC ATCTCCCACA TCCGQCTATO GGCCCTGCAG 240 

CTGATCTTCG TQTCCAGCCC A6C3GCTCCTA GTGGCCATGC ACQTGGCCTA CCGQAQACAT 300 

GAGAAGAAGA GQAAGTTCAT CAAGGGGGAG ATAAAGAGTG AATTTAAGGA CATCGAGGA6 360 

ATCAAAACCC A6AAGGTCCG CATCGAAGGC TCXXHtSTGOT GGACCTACAC AA6CA6CATC 420 

TTCTTCCGGG TCATCTTCGA AGCC6CCTTC ATGTACGTCT TCTATGTCAT GTACGACGGC 480

TTCTCCATGC AGCG6CTGGT GAAGTGCAAC GCCTGGCCTT GTCCCAACAC TGTGGACTGC 540

TTTGT6TCCC G6CCCA006A GAAQACTGTC TTCACAGTGT TCAT6ATTGC AGT0TCTG6A 600

ATTTOCATCC tGCTGAATGT CACTGAATTG TCTTATTTQC TAATTftGATA TTOTTCtGGG 660
AAGTCAAAAA AGCCAGTTTA A 

Seq ID NO: 439 Protein sequence
Protein Accession #: NP_003995.1

1 11 21 31 41 51 

I I I I I I 

MDWGTLQTIL GGVNKHSTSI GKIWLTVLPI PRIMILWAA KBVWGDBQAD PVCaiTLQPGC 60 

KNVCyOHYFP ISHIRLWALQ LIFVSSPALL VAMHVAYRRH EKKRKFIKGB IKSBPKDIBB 120 

IKTQXVRIEO SLWWTYTSSI FFRVIPEAAP MYVFYVMYDQ FSMQRLVKCN AWPCPHTVDC 180 
FVSRPTEKTV FTVPMIAVSG IC1LU3VTEL CYLL1RYCS6 KSKKPV 






Seq ID NO: 440 DNA sequence 

Nucleic Acid Accession #: XM_061091.1

Coding sequence: 1..2481

1 11 21 31 41 51

I I I I I I

ATGCCAAATA CTTCAGGAAC AACCAGGATT GAAATTTGGC TTCTCCRAGA GCCGCCCGGG 60 

CACCGAGCGC TGGTGGCCGC TCTCCTTCCG 6TGAGTC0CA GCCCCQAGTT GGCTCTGGCG 120 

CCCGGGTAOC 03CCAOTCCC GGCTGCXXSAT GACCGATTCA OSCTOCCGAT GATTGGAGGT 180 

CAGATOCATG GTGAGAAGGT AGATCTCTGG AGCCTTQOTQ TTCTTTGCTA TGAATTTTTA 240 

GTTGGOAAQC CTCCTTTTGA GGCAAAC3GAA GTCCATGTAA GCAAAGAAAC CATCGGGAAQ 300 

ATTTCAGCTG CCAGCAAAAT GATGTGGTGC TOGGCTGCAG TGQACATCAT GTTTCTGTTA 360 

GATGGGTCTA ACAGOGTOGG GAAAOGGAGC TTTGAAAQGT CCAAGCACTT TGC CATCA CA 420 

GTCTQTOACO OTCTGOACAT CA0CCCC8A0 AOGGTCAGAG TGGGAGCAfT CCA0TTCAGT 480 

TCCACTCCTC ATCTGGAATT CCCCTTOQAT TCATTTTCAA CCCAACAGGA AOTOAAOGCA 540 

A6AATCAAGA GGATGGTTTT CAAAGGAGGG CGCACGGA6A OGGAACTTGC TCTGAAATAC 600 

CTTCTGCAC3V GAGGGTTGCC TGGAGGCAGA AATGCTTCTG TGCCCCAQAT CX^TCATCATC 660 

GTCACTGATG GQAAGTCCCA GGGQGATGTG GCACtGCCAT CCAAGCAOCT GAAGGAAAGG 720 
GGTGTCACTG TGTTTGCTGT GGGGGTCAGG TTTCCCAGGT GGGAGGAGCT GCATGCACTG 780
GCCAGCGAGC CTAGAGGGCA GCACGTGCTG TTGGCTGAGC AGGTGGAGGA TGCCACCAAC 840
GGCCTCTTCA GCACCCTCAG CAGCTCGGCC ATCTGCTCCA GCGCCACGCC AGCTGGGAGC 900

CCCGAGCTTG TCTTCATGGA GCGGTTAATG GGCATCTCTC TGATAGGCCC CTGTGACTCG 960
CAGCCCTGCC AGAATGGAGG CACATGTGTT CCAGAAGGAC TGGACGGCTA CCAGTGCCTC 1020
TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC TGTGCCCTGA AGCTGAGCCT GGAATGCAGG 1080
GTCGACCTCC TCTTCCTGCT GGACAGCTCT GCGGGCACCA CTCTGGACGG CTTCCTGCGG 1140

GCCAAAGTCT TOGTGAAGOG GTTTGTGOGG GCCGTGCTGA GC6AGGACTC TCGGGCOCGA 1200 

GTGGGTGTGG CCACATACAG CAGGGAGCTQ CTGGTGGOGG TGCCTGTGGG GGAGTACCAO 1260 

GATGTGCCTG ACCTGGTCTG GAGCCTCGAT GGCATTCXXTT TCCGTGGTGG CCCCACCCTG 1320 

AOGGGCAGTG CCTTGOGGCA GGCSGGCAQAO OGTrGGCTTGG GGAGOGCCAC CAGGACAGQC 1380 

a\GGACGGGC CAOGTAOAQT GGTGGTTTTG CTCACTGAGT CACACTCGQA GGATGAGGTT 1440 

GOGGGCCGVG OGOSTCACXSC AAGGGOGCGA GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG 1500 

GCCGTGCGGG CAGAGCTGGA QGAGATCACA G6CAGCCCAA AGCATGTGAT GGTCTACTOS 1560 

GATCCTCAGG ATCTGTTCAA CCAAATCCCT GAGCTGCAGG GQAAGCTGTG CAGCCGGCAG 1620 

GGGCCAOQGT GCC3QQACACA AGCCCTGGAC CTCGTCTTCA TGTTGGACAC CTCTGC CTCA 1680 

GTAGG6CCCG ASAATTTTOC TCAGATGCAG AGCTTTGTQA GAAGCTGTGC CCTCCAGTTT 1740 

GAGGTQAACC CTGACGTGAC ACAGGTCGGC CTGQTGGTQT ATGGCAGCCA GGTGCAGACT 1800 

GCCTTCGGGC TGGACACCAA ACCCACCCGG GCTGOGATGC TGCGGGCCAT TAGCCAQGCC 1860 

CCCTACXnAO GTGGGGTGGG CTCAGCCGGC ACCGCXXTGC TGCACATCTA TGACAAAGTG 1920 

ATGACOGTCC AGAGGGGTGC CCGGCCTGGT GTCCCC3VAAG CTGTGGTGGT GCTCACAGGC 1980 

GGGAGAGGOG CAGAGGATGC AGCCX3TTCCT GCCX^GAAGC TGAGGAACAA TGGCATCTCT 2040 

GTCTTG6T0G TGGGCGTGGG GCCTGTCCTA AGTGAGGGTC TG0GGA6GCT TGCAGGTCCC 2100 

CGGGATTCCC TQATCCACX3T GGCSkGCTTAC GCOSAOCTGC GGTAGCACCA GGAOSTGCTC 2160 

ATTCAGTGGC TGTQTGGAQA AGCCAAGCAG CCAGTCAACC TCT6CAAACC CAGCCOGTGC 2220 

ATOAATGAGG GCAGCTGCQT CCTGCAGAAT GGGAQCTACC GCTGCAAGTG TOGGGATGGC 2280 

T666AGG0CC CCX:ACTGCGA GAACCGTGAG TGGAGCTCTT GCTCTGTATG TGTGAGCCAG 2340 

GGATGGATTC TTGAGACGCX: CCTGAGGCAC ATGQCTCCOO TGCAGGAGGG CAGCAGCCGT 2400 

ACCCCTCCXa GCAACTACAG A6AAGGCCTG GGCACTGAAA TGOfTGCCTAC CTTCTGGAAT 2460 
GTCTGTGCCC CAGGTCCTTA G 

Seq ID NO: 441 Protein sequence
Protein Accession #: XP_061091.1

1 11 21 31 41 51

I I I I I I

MPNTSGTTRI EIWLLQEPPG HRALVAALLP VSPSPELALA PGYPPVPAAD DRFTLPMIGG 60

QMHGEKVDLW SLGVLCYEFL VGKPPFEANE VHVSKETIGK ISAASKMMWC SAAVDIMFLL 120

DGSNSVGKGS FERSKHFAIT VCDGLDISPE RVRVGAFQFS STPHLEFPLD SFSTQQEVKA 180

RIKRMVFKGG RTETELALKY LLHRGLPGGR NASVPQILII VTDGKSQGDV ALPSKQLKER 240

GVTVFAVGVR FPRWEELHAL ASEPRGQHVL LAEQVEDATN GLFSTLSSSA ICSSATPAGS 300

PELVFMERLM GISLIGPCDS QPCQNGGTCV PEGLDGYQCL CPLAFGGEAN CALKLSLECR 360

VDLLFLLDSS AGTTLDGFLR AKVFVKRFVR AVLSEDSRAR VGVATYSREL LVAVPVGEYQ 420

DVPDLVWSLD GIPFRGGPTL TGSALRQAAE RGFGSATRTG QDRPRRVVVL LTESHSEDEV 480

AGPARHARAR ELLLLGVGSE AVRAELEEIT GSPKHVMVYS DPQDLFNQIP ELQGKLCSRQ 540

RPGCRTQALD LVFMLDTSAS VGPENFAQMQ SFVRSCALQF EVNPDVTQVG LVVYGSQVQT 600

AFGLDTKPTR AAMLRAISQA PYLGGVGSAG TALLHIYDKV MTVQRGARPG VPKAVVVLTG 660

GRGAEDAAVP AQKLRNNGIS VLVVGVGPVL SEGLRRLAGP RDSLIHVAAY ADLRYHQDVL 720




IEWLCGEAKQ PVNLCKPSPC MNEGSCVLQN GSYRCKCRDG WEGPHCENRE WSSCSVCVSQ 780
GWILETPLRH MAPVQEGSSR TPPSNYREGL GTEMVPTFWN VCAPGP

Seq ID NO: 442 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2424






1 

ATGCCCCCTT 
TCTCTCCCTC 
AGCAAAATGA 
AGCGTCX3GGA 
CTGGACATCA 
CTGGAATTCC 
ATGGTTTTCA 
QGGTTGCCTQ 
AAGTCCCAGG 
TTTGCTGTGQ 
AGAGGGCAGC 
ACCCTCAGCA 

ccctgtgagc 
agaggatcxk; 

AGAGTGTTCC 
TCGCAGCCCT 
CTCTGCCGGC 
AGGGTC6A0C 
CGGGCCAAAQ 
CGAGT6G6TG 
CAGGATGTGC 
CTGACGGGCk 
6GGCAGGAGC 
GTTG06G6CC 
QAGGCOGTGC 
TCGQATCCTC 
CAGCGGCCAG 
TCAGTAGGGC 
TTTGAGGTGA 
ACTGCCTTOS 
GCCCCCTACC 
GT6ATGAC06 
GGOOO GAGAG 
TCTGTCTTGG 
CCCCGG6ATT 
CTCATTGAGT 
TGCATGAATG 
GGCTGGGAG6 
CAGGQATG6A 
C6TACCCCTC 
AATGTCTOTO 



11 
I 

TCCTGTTGCT 
TCCAGGAAGT 
TGTGGTGCTC 
AAGGGAGCTT 
GCCCCGAGAG 
CCTTGGATTC 
AAGGAGGGOG 
6A66CA6AAA 
GGGAT6TGGC 
GGGTCAGGTT 
ACGTGCTGTT 
GCTCGGCCAT 
ACAGGACGCT 
GGC66ACCCT 
TAACCCACCC 
GCCyUSAATGG 
TGQCCTTTGG 
TCCTCTTCCT 
TCTTCGTOAA 
TGGCCACATA 
CTGACCTGGT 
GTGCCTTGOG 
GGCCAOGTAO 
CAGOC3C6TCA 
GGGCAGAGCT 
AGGATCTGTT 
GGTGCOSGAC 
CGQM3AATTT 
ACCCT6ACQT 
6GCTGGACAC 
TAGGTGGGGT 
TCCAGAGGGG 
OOQCAGAGOA 
TCGTGGGCGT 
CCCTC3ATCCA 
GGCTGTGTGG 
AGGGCAGCTG 
GCCCCCACTG 
TTCTTGAQAC 
CCAGGAACTA 
CCCCAQGTCC 




51 

1 

AGTGCCCCCA 

TTCAGCT6CC 

TGGGTrCTAAC 

CT6T6AGGC3T 

CACTCCTCAT 

AATCAAGAGG 

TCTGCACAQA 

CACTOATGIGO 

TGTCACTGT6 

CAQOGAGCCT 

CCTCTTCAGC 

06AGGCTCAC 

CCCATGCTGG 

CAGCTGGAAG 

CCCCTGTGAC 

CTACCAGTGC 

CXTTGGAATGC 

CGQCTTCCTG 

CTCTCQGGCC 

GGGGGAGTAC 

TGGCCCCACC 

CACCAGGACA 

CGAGGATGAG 

TGTAGGCAGT 

GATGGTCTAC 

GTGCAGCCGG 

CACCTCTGCC 

TGCCCTCCAG 

CCAGGTGCAQ 

CATTAGCCAG 

CIATGACAM 

GGTGCTCACA 

CAATGGCATC 

GCTTGCAGGT 

CCAG6AC6T6 

ACXXAGCCCG 

GT6T06G6AT 

AT6TGTGAGC 

QGQCAGCAGC 

TACCTTCTGG 



Seq ID NO: 443 Protein sequence
Protein Accession #: Eos sequence



MPPFLLLEAV 



KVFKQQBTET 
FAVGVRPPRH 
PCEHRTLENV 
SQP0QMG6TC 
RAKVFVKRFV 
LTGSALRQAA 
EAVRAELEBI 
SVGPENFAQM 
APYLGGVGSA 
SVLWGV6PV 
CMNEGSCVLQ 
RTPPSNYREG 



11 
I 

CVFLPSRVPP 
KBFAITVCZ}G 
ELAIiKYIJiBR 



RBFAGNAPCW 
VPB6LD6YQC 
RAVLSEDSRA 
ERGFGSATRT 
TGSPKHVMVY 
QSFVRSCALQ 
GTALLHIYDK 
LSEGLRRIiAG 
N6SYRCKCRD 
LGTBfVPTFW 



21 

1 

SLPLQEVHVS 
LDISPERVRV 
GLPG6RNASV 
RGQEVUiAEQ 
RGSRRTLAVIi 
LCPLAPGGEA 
RVGVATYSRE 
GQDRPRRVW 
8DFQDLFNQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSLIHVAA 
GWBGPHCEtIR 
MVCAPGP 



31 
I 

KETIGKISAA 
GAFQFSSTPH 
PQILXZVTD6 
VEDATmSiPS 
AAHCPFYSWK 
NCAIiKLSLSC 
LLVAVPV6EV 
liLTESESBDE 
FELQGKLCSR 
aLWYGSQVQ 
GVPKAWVLT 
YADLRYHQDV 
EHSSCSVCVS 



41 

I 

SKMMWCSAAV 
LEPPLDSFST 
KSQGDVALPS 
TLSSSAICSS 
RVFLTHPATC 
RVDLLFLU3S 
QDVPDLVWSL 
VAGPARHARA 
QRPGCRTQAli 
TAPGIiDTKPT 
G6RGABDAAV 
LIEWLCGEAK 
QGWIIiETPIiR 



Seq ID NO: 444 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 89..2356



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
B40 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1B60 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



51 
I 

DIKFLIiDGSN 
QQBVKARZKR 
KQLKERGVTV 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTLDGFL 
DGIPFRGGPT 
RELLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLBNNGI 
QPVNLCKPSP 
HMAPVQEGSS 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



GCCCCCTGGC 
GTCGCCGCTC 
TGTTTTCCTG 
AGAAACCATC 
CATCAT6TTT 
GCACTTTGCC 
A6CATTCCAQ 
ACAQGAAGT6 
ACTTGCTCTG 
CCAGATCCTC 



11 

i 

CCGAGCCX3CG 
TCCTTCCGTT 
TTTTCCAGAG 
GGGAAGATTT 
CTGTTAGATG 

ATCAcnsrcr 

TTCAQTTCCA 
AAG6CAAGAA 
AAATACCTTC 
ATCATOGTCA 



21 

1 

CC0GGGTCT6 
ATATCAACAT 
TGCCCCCATC 
CAGCTGCCA6 
GGTCTAACAG 
GTOAGGGTCT 
CTCXrrCATCT 
TCAAGAOGAT 
TGCACAGAGG 
CTGATGGGAA 



31 
1 

TQAGTAGAGC 
GCCCCCTTTC 
TCTCCCTCTC 
CAAAATGATG 
C6TGGGGAAA 
G6ACATCAGC 
GGAATTCCCC 
GGTTTTCAAA 
GTTGCCTGGA 
GTCCCAGGOG 



41 
I 

CGCCCGGGCA 
CTQTTGCTGG 
CAGGAAGTCC 
TGGTGCrrCGG 
GGGA6CTTT6 
GOCGAGAGGG 
TTGGATTCAT 
G6AGG6CGCA 
GGCAGAAATG 
GATGTGGCAC 



51 

CCGAQCGCTG 60 

AAGCGGTCTG 120 

ATGTAAGCAA 180 

CTGCAGTGGA 240 

AAAGGTCCAA 300 

TCAGAGTGGG 360 

TTTCAACCCA 420 

CGGAGACGGA 480 

CTTCTGTGCC 540 

TGCCATCCAA 600 




GCAGCTGAAG GAAAGGGGTG TCACTGTGTT TGCTOTGGGa GTCAG6TTTC GCAG6TGGGA 660 

GOAGCTGCAT GCACTGGCCA GCOAGCCTAG AGGGCAQCAC GTGCT6TTQG CTGAGCAOGT 720 

GGAGGATGCC ACCAACGGCC TCTTCAGCAC CCTCAGCAGC TCX5GCCATCT GCTCCA60GC 780 

CACQCCAGAC TGCAG6GTOG AGGCTCACCX: CTOTQAGCAC AGGACGCTGG AGATGGTCX^G 840 

GGAGTTCGCT GGCAATGCCC CATGCTGGAG AGGATCGCGG CGGACCCTTG CGGTGCTGGC 900

TGCACACTGT CCCTTCTACA GCTGGAAGAG AGTGTTCCTA ACCCACCCTG CCACCTGCTA 960 

CAGQACCACC TGCCCAGGCC CCTGTGACTC GCAGCCCTGC CAGAATGGAG GCACATGTGT 1020 

TCCAGARGGA CTGGACGGCT ACCAGTGCCT CTGCCOGCTG GCCTTTGGAG GGGAGGCTAA 1080 

CTGTGCCCTG AAGCTGAGCC TGGAATGCAG GGTOSACCTC CTCTTCCTGC TGGACASCTC 1140 

TGCGGGCACC ACTCTGGACG GCTTCCTGCG GGCCAAAGTC TTCGTGAAGC GGTTTGTGCG 1200

GGCCGTGCTG AG06A66ACT CTCX3G6CG08 AGTG6GT6T0 GCCACATACA GCAGGQAGCT 1260 

GCTGGTG6CG QTQCCTGTGG GGGAGTAOCA GGATGTGCCT GACCTGGTCT GGAGCOTCX3A 1320 

TGGCATTCCC TTCCGTGGTG GCCCCACCCT GACGGGCAGT QCCTTGOGGC AQGOGGCAGA 1380 

GC6TGGCTTC GGGAGC6CCA CCAGGACAGG CCAG6ACCG6 CCAG6TAGA6 TGGTG6TTTT 1440 

GCTCACTGAG TCACACTCGG AGGATGAGGT TGCGGGCCCA GCGCGTCACG CAAGGGCGCG 1500

AGA6CT6CTC CT6CTGGGTG TAG6CAGT6A GGGGGT6G0G 6CAGA6CTGG AGGAGATCAC 1560 

AGGCAGCCCA AAGCATGTGA TGGTCTACTC GQATCCTCAG GATCTGTTCA ACCAAATCCC 1620 

TGAGCTGCAG GGGAAGCTGT GCAGCOGGCA GCGGCCAGGG TGCOGGACAC AAGCCXTTCGA 1660 

CCTCGTCTTC ATGTTGGACA CCTCTGCCTC AGTA6GGCCC GAGAATTTT6 CTCAGATGCA 1740 

GAGCTTTGTG AGAAGCTGTG CCCTCCAGTT TGAGGTGAAC CCTGACGTGA CACAGGTCGG 1800

OCTGGTGQTQ TATGGCAGCC AGGTGCAGAC TGCCTTOGGG CTGGACACCA AACCCACC06 1860 

GGCTGCGATG CTGCGGGCCA TTAGCCAGGC CCCCTACCTA GGTGGGGTGG GCTCAGCOSG 1920 

CACCX3CCCTG CTGCACATCT ATGACAAAGT GATGACCXSTC CAGAGGGGTG CCCGGCCTGG 1980 

TGTCCCCAAA GCTGTGGTGG TGCTCACAGG CGGGAGAGGC GCAGAGGATG CAGCCGTTCC 2040

TGCCCAGAAG CTGAGGAACA ATGGCATCTC TGTCTTGGTC GTGGGCGTGG GGCCTGTCCT 2100

AAGT6A6GGT CTQCGGAGGC TTQCAQGTCC CCQGGATTCC CTQATCCACX3 TGGCAGCTTA 2160 

CGCC6ACCTG 006TACCACC AOQACGTGCT CATTGAGTGG CTGTGTGGAG AAGCCAAGCA 2220 

GCCAGTCAAC CTCTQCAAAC CGAGCCCGTG CATQAATGAG GGCAGCTGOG TCCTGCAGAA 2280 

TGGGAGCTAC C6CTGCAAGT GTOGGGATGG CTG6GAGGGC CCCCACPGCG AGAACC3GATT 2340 

CTTGAGACGC CCCTGAGGCA CATGGCTCCC GTGCAGGAGG GCAGCAGCCG TACCCCTCCC 2400

AGCAACTACA GAGAAGGCCT GGGCACTGAA ATGGTGCCTA C5CTTCTGGAA TGTCTGTGCC 2460 

CCAGGTCCTT AGAATGTCTG CTTCCCGCCG TGGCCAGGAC CACTATTCTC ACTGAGGGAG 2520 

GAGGATGTCC CAACTGCAGC CATGCTGCTT AGAGACAAGA AAGCAGCTGA TGTCACCCAC 2580 

AAAOGATGTT GTTGAAAAGT TTTGATGTGT AAGTAAATAC CCACTTTCTG TACCTGCTGT 2640 

G0CTT6TTGA GGCTAT6TCA TCTGCCACCT TTCGCTTGAO OATAAACAAO GGGTCCTGAA 2700

GACTTAAATT TAGOGGCCTG ACGTTCCTTT GCACACAATC AATGCTOGCC A8AKIQTTGT 2760 
TGACACA6TA AT6CCCAGCA GAGGCCTTTA CTAGAGOkTC CTTTGGACOO 






Seq ID NO: 445 Protein sequence
Protein Accession #: Eos sequence






NPPPLLLBAV 
SVGKGSFERS 
NVFBaQaRTBT 
FAVGVRFPRW 
PCEHRTLEMV 
SQPCQNGGTC 
RAKVPVKRFV 
LTGSALRQAA 
EAVRAELEBI 
SVGPEiTPAQM 
APYLGGVGSA 
SVLWOVOFV 
CMNBQSCVLQ 



11 
I 

CVFLPSRVPP 
XHFAXTVC33G 
ELALKYUJIR 
KKLHALASEP 
RBFACaiAPCW 
VPEGLDGYQC 
RAVLSEDSRA 
ERGFGSATRT 
TGSPKHVMVY 
QSFVRSCALQ 
GTALLUIYDK 
LSEGLRRLAO 
NGSYRCKCRD 



21 
I 

SLPLQBVHVS 
LDIBPERVRV 
GLPGGRNASV 
RGQHVLLAEQ 
RG8RRTLAVL 
LCPLAFGGEA 
RVGVATYSRE 
GQDRPRRVW 
SDPQDLPNQI 
PBVNPDVTQV 
VMTVQR6ARP 
PBOSLIHVAA 
GHEGPHCENR 



31 

I 

RBTIGKISAA 
GAFQFSSTPH 
PQZLIIVTDG 
VEDATNGLFS 
AAHCPPYSWK 
NCALKLSLEC 
LLVAVPVGEY 



PELQGKCtCSR 
GLWYGSQVQ 
GVPKAVWLT 
YADLRYHQDV 
FLRRP 



41 
I 

SKMMWCSAAV 
LEFPLDSFST 
KSQGDVALPS 
TLSSSAICSS 
RVPLTHPATC 
RVDLLFLLDS 
QDVPDLVWSL 
VAGPARHARA 
QRPGCRTQAL 
TAFGIiDTKPT 
GGR6ABDAAV 
LXBHLCGEAK 



51 

I 

DIMFIiLDGSN 
QQEVKARIXR 
KQLKERGVTV 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTLDGFL 
DQIPFRGGPT 
RELLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLRtaNGZ 
QPVHLCKPSP 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 






Seq ID NO: 446 DNA sequence

Nucleic Acid Accession #: NM_031942.

Coding sequence: 145..1260



1 

I 

CCGGAGCCCC 
TGCTCCTCCT 
CC6ATCTGGG 
GTAAASAAGA 
TCCTCTGATG 
TCAGTTOGGG 
GCGATGAA6T 
CaGCCCTCAG 
AATTTTTTGG 
ATOTCTOAAT 
GACTC»CAAT 
CCTOAACQGA 
GCTCTACCCA 
ACG6TGGATG 
GTGACCCTTC 
GTCTGCAGCA 
TGCCGTCAGA 
C6AGGCCAGT 
CTGCTGGATC 
08GCAGGGAG 
TTTGGGAATG 
TATCTGGAAA 
TTTTTTCACT 
AATCAAGTTA 



11 
I 

GCCCCTCC6G 
GCTGTGQGAC 
CACCCGCCAC 
ACTTAAA6AA 
ACAGTTQTGA 
AAGGCTGTAG 
TTCCAGCGCG 
AGAATTCTGT 
AGAAAAQGGC 
TAGAAAQCTT 
CAAGGAGACXr 
GAGCTOGTCC 
TGGAGGAGGA 
GCTACATOAA 
OSCATATAAT 
ATTCTOQAGA 
AGACTATTGA 
TCTGTGQCCC 
OGAACTGGCA 
ATGGACQGTG 
TGCAT6CCTA 
ATTPGCTGCC 
GAAACCTGAG 
ATCTTAGCAG 



21 
I 

GCCCGGGT06 
CGCTGACCGC 
CAGCATGGAC 
ATTCAGATAT 
CAGCTTTGCT 
GACCC6CA6C 
GAGTACCAGQ 
GACTGATTCC 
TTTAAATATA 
OCCTGGCTCG 
6CGAAGG0GT 
TCTTACCAGG 
GGAGGAAGAG 
TGAAGATGAC 
TOGGCCAGTQ 
GAAGATATAT 
TACCAAAACA 
CTGCCTTCGA 
TTGCOCGCCT 
TGCQACTQGQ 
CTTGAAAAGC 
TGCCTTCTAC 
TTAAAAATCT 
ACATGTGTTT 



31 
I 

G06CGCCCAG 
GCX3GCTGCTC 
GCTCGCCX5CG 
GTGAAGTTGA 
TCTGATAATT 
CAGTGCAGGC 
GGAGCAACCA 
AACTCCGATT 
AAGCAAAACA 
TTCCGTaOAA 
ACATTCCOGG 
TCAAG6TCCC 
GATAAGTACA 
CTGCCCA6AA 
GAAGAAATTA 
AACCGTTCAC 
AACTGCAGAA 
AACCGTTATG 
TGTCGAGGAA 
GTCCTT6TGT 
CTQAAACAGG 
TTCTCAAATC 
TGATGATCAG 
CTGGAGCATC 



41 

1 

CCTGCCAGCC 
CGCTCTCCCC 
TGCCCCAGAA 
TTTCCATGGA 
TTGCAAACAC 
ACTCTG6ACC 

CAGAAGATGA 
AAGCAATGCT 
GACATOCGCT 
GTGTT6CTTC 
GGATCCTOQG 
TGTTGGTGAG 
GCCGTCGCTC 
CAGAGGAGGA 
TGGGCTCTAC 
ACCCAGACTG 
GTGAAGAGGT 
TCTGCAACT6 
ATTTAGCCAA 
AATTTGAAAT 
TTTCTTGTAA 
CCTQTTTCAT 
ACA6AAGGTA 



51 

1 

GCGCTGCTGC 
GCTCCAAGCG 
AGATCTCAGA 
AACCTOGTCA 
0AGGCT6CAG 
TCTCAGGGTG 
AGAGTCCCGC 
AAGTGGAATG 
TGCAAAACTC 
CCCAGGCTCC 
CAGGAGAAAC 
OTCCCTTGAC 
AAAGA6GAAG 
CAGATCATCC 
GTT6GAGAAC 
TTGTCATCAA 
CTGGGGCGTT 
CAGGGATGCT 
CAGTTTCTGC 
ATATCATGGC 
GCAAGCATAA 
AAGTTTCCAA 
AAQAAACTCC 
TATTGCTAGT 



60 
120 
160 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 




TACACTTTGC CCTCCTGCAG TTTCTTCTCT GCTCCCAACC CCCATCTCAT AGCATCCCCC 
TCTATTTCCA ATGCTCCTCT CCAACCGCTT AOTTTCTQAA TTTCTTTTAA ATTACAGTTT 
TATGAAAGCA TATTTTATTP ACTT6GTGTT GAAATAGCCC TCATAAAACC TAAGCACTTG 
GAAACACAAT AATAGTATTA ACTAACTAGA TCTATTGAAT TTCAGAGAAG AGCCTTCTAA 
CTTGTTTACA CAAAAACGAQ TATQATTTAG CACTCATACT AGTTGAAATT TTTAATAlGAA 
TCAAGGCACA AAAGTCTTAA AACCATQTGG AAAAATTAGG TAATTATTGC AGATTOATGT 
CTCTCAATCC CATGTATTGC GCTTATCTTA CAAGTT6TTG TCACAGTTGA GACTTAATTT 
CTCCtAATTT CTTCTGCCCO AAGGGTAAGT GGTGCGTCCA GCTTACAOGA TCATAATTCA 
AAGGTTGGTG GGCAATGTAA TACTTAATTA AAATAATGAT GGAAGAGCTA TCTQGAGATT 
ATXSA6TAAGC TGATTTGAAT TTTCAGTATA AAACTTTAGT ATAATTGTAG TTTGCAAAGt 
TTATTTCAGT TCACATGTAA GGTATTGCAA ATAAATTCTT GGACAATTTT GTATGGAAAC 
TTGATATTAA AAACTA6TCT GTGGTTCTTT GCAGTTTCTT GTAAAiTTTAT AAACCAGGCA 
CAAGGTTCAA GTTTAGATTT TAAGCACTTT TATAACAATG ATAAGTGCCT TTTTG GAGAT 
GTAACTTTTA GCAGTTTGTT AACCTGACAT CTCT6CCAGT CTAGTTTCTG GGCAGGTTTC 
CTGTGTCAGT ATTCCCCCTC CTCTTTGCAT TAATCAAGGT ATTTQ6TA6A GglGGA ATCT 
AAGTGTTTGT ATGTCCAATT TACTTGCATA TGTAAACCAT TGCTGTGCCA TTCAAT GTTT 
GAT6CAIAA.T TG6ACCTT6A ATGGATAAGT GTAAATACAO CTTTTGATCT 6TAATGCTTT 
TATACAAAAG TTTATTTTAA TAATAAAATO TTTGTTCTAA AAAAAAAAAA 






1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 



Seq ID NO: 447 Protein sequence
Protein Accession #: NP_114148.1



1 11 21 31 

t I I 

MDARRVPQKD liRVKKNLKKF RTVKIilSHBT 
RSQCRHSOPL RVAMKFPARS TRGATNKKAE 
NIKQNKAMIA KLMSELESFP GSFRGRHPLP 
TRSRSRILG8 LDALPMEEBE EEDKYMLVRK 
PVEBITEEEL ENVCSNSREK lYNRSLGSTC 
LRNRYGEEVR DALLDPKWKC PPCRGIOICS 
KSLKQEFEMQ A 



41 

I 



51 



SSSSDDSCD8 PASDNFAIITR LQSVSEQCRT 
8RQPSC1TSVT DSNSDSBDES GKHFLEKRAL 
GSDSQSRRPR RRTFPGVASR RNPERRARPL 
RKTVDGyKNE DDLPRSRRSR SSVTLPHUR 
HQCRQKTIDT KTNCRNPDCW GVRGQFCGPC 
FCRQRlXatCA TGVLVYLAXy BGFGNVHAYL 



Seq ID NO: 448 DNA sequence
Nucleic Acid Accession #: NM_019894
Coding sequence: 1..1314 



ATGTTACAiSG 
AAACCCOGTA 
CTGAGCCTGG 
TACTTCCTCT 
CTGGACTGTC 
GCAGTGGCAG 
QGQAACTGOT 
AGGCAGATGG 
GATCTGGATG 
GGGCCCTGTC 
AAGACCCCCC 
AGCATCCAOT 
CTO^COOCAG 
GGCTCAGACA 
TTCAACCCCA 
ACTTTCTCA6 
GCCACCCCAC 
GACATACTGC 
GCGTACCAGG 
GACACCTGCX 
GTGGGCATC6 
AAGSTCTCAO 



11 
I 

ATCCTGACAG 
TCCCCATGGA 
CX3AGTATCAT 
GCGGGCAGCC 
CCTTGQGGGA 
TCCGCCTCTC 



GCTACAGCAQ 
TTGTTGAAAT 
TCTCAGGCTC 
GTGTGGTGGG 
AGGACAAACA 
CCCACTGCTT 
AACTGGGCAG 
TGTACCCCAA 
GCACAQTCAG 
TCTGGATCAT 
TGCAGGC9GTC 
G6GAA6TCAC 
AGGGTGACAG 
TTAGCTGGGG 
CCTATCTGAA 



21 

I 

TGATCAACCT 
GACCTTCAQA 
CATT6TGGTT 
TCTCCACTTC 
OGACGAGGAG 
CAAGGACC8A 
TTTGGACAAC 
C!AAACCCACT 
CACAGAAAAC 
CCTGGTCTCC 
TGGG GftGGAG 
GCACXSTCTGT 
CAG6AAACAT 
CTTCOCATCC 
AGACAATGAC 
GCCCATCTGT 
TGGATG6Q6C 
AGTCCAGGTC 
C6AGAA6ATG 
TOGTGGGCCC 
CTATGGCTGC 
CTGGATCTAC 



31 

I 

CTGAACAGCC 
AAGGIGQGQA 
GTCCTCATCA 
ATCCOGAGGA 
CACTGTGTCA 
TCX31GVCTGC 
TTCACA6AAG 
TTCAGA6CTG 
AGCCAGGAGC 
CTGCACTGTC 
GCCTCTGTGQ 
GGAGQGAGCA 
AC0QATGT8T 
CTGGCTGT6G 
ATC3GCCCTCA 
CTGCCCTTCT 
TTTACGAAGC 
ATTGACAGCA 
ATGTGTGCAG 
CTGATGTACC 
GGG6GCC06A 
AATGTCTGGA 



41 

I 

TOGATGTCAA 

Tcccca iTCAy 

AGGTGATTCT 

AGCAGCTGTG 
AGAGCTTCCC 
AGGTGCTGGA 
CTCTOGCTGA 
TGGAGATTGG 
TTC6CATGCG 

ATTCTTGGCC 
TCCTGGACCC 
TCAACTGGAA 
CCAAGATCAT 
TGAAGCTGCA 
TTGA1GAG6A 
AGAATGGAGG 
CACGGTGCAA 
GCATCCCGGA 
AATCTGACCA 
GCACCCCAGG 
AGGCTOAOCT 



51 

1 

ACCXrCTGCGC 
CATA6CACTA 
GOATAAATAC 
TGACGGAGAG 
CQAAQGGCCT 
CTCGGCCACA 
GACAGCCTGT 
CCCAGACCAG 
GAACTCAAGT 
GAAGAGCCTG 
TTGGCAGGTC 
CCACTG6GTC 
GGTGCGGGCA 
CATCATTGAA 
GTTCCCACTC 
GCTCACTCCA 
GAAGATGTCT 
TGCAGACGAT 
AGGGG6TGTG 
GTGGCATQTO 
AGTATACACC 
GTAA 



Seq ID NO: 449 Protein sequence 
Protein Accession #: NP_063947.1



^aJlQDPDSDQP 

YPLCGQPIiHP 
GNKFSACFDN 
GPCLSGSLV8 
LTAAHCFRKH 
TFSGTVRPIC 
AYQGEVTERM 



11 

1 

LNSLDVKPLR 
IPRKQIiCDGE 
FTEALABTAC 
IiHCLACGKSL 
TDVFNMKVRA 
LPFFDEELTP 
MCAGIPE6GV 



21 
1 

KPRIPMETFR 
LDCPIdQEEDEB 
RQMGYSSKPT 
KTPRWGGEE 
GSDKLGSFPS 
ATPLNIIGWG 
DTGQGDSG6P 



31 41 

i 1 
KVGIPIIIAL I.SLASIIIW 
HCVKSFFE6P AVAVRLSXDR 
FRAVBIGPDQ DLDWBITEM 
ASVDSWPWQV SIQYDKQHVC 
LAVAKIIIIE PNPMYPKDND 
FTKQNGGKM6 DILLQA5VQV 
LHYQSDQMHV VGIVSHSXQC 



VLIKVILDKY 
STDQVWSAT 



GGSILDPHWV 
lAIiMKLQPPL 
IDSTRCNADD 
OOPSTPGVYT 



Seq ID NO: 450 DNA sequence 

Nucleic Acid Accession #: XM_051860.2

Coding sequence: 52..3042



1 

I 

GCTCACCCAQ 
GTTAAGCTCA 
6AC0GGGGCA 
CCCAAACTCA 
AATGTACAGT 
TACCAGGCAG 



11 
I 

QAAAAATATG 
GCACGQAGGT 
GA6CCT6CCG 
CAGTCACCAT 
CATGGAAACC 
AAGAGTTCCA 



21 

1 

CAATCGTCCC 
TGTCTACAAA 
GAGCTACCX3T 
TGACACCAAT 
TGGAGATACC 
GGTGCTTCCC 



31 

I 

ATT6ATATAC 
AAAGGGCAGQ 
6TA0GGTTCC 
GTGAACA6CA 
CTGGTCATTG 
TGCAGATCCT 



41 



51 



60 
120 
180 
240 
300 
360 



60 
120 
180 

240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



60 
120 
180 
240 
300 
360 
420 



AGGCCACTAC AATGGATGGA 
ATTATA6GTT TGCTTGCTAC 
TCTGTG6QAA GCCTGTGAGG 
CCATTCTGAA CTTGGAGGAT 
CCAGTACTGA TTACTCCATG 
GCGCCCCCAA CCAGGTCAAA 



60 
120 
180 
240 
300 
360 




GTGGCAGGGA AACCAATGTA CCT6CACATC G(3GGAGGAGA TAGACGG06T GGAOVTGCGG 420 
GCGGAGGTT6 G6CTTCTGAG CXXSGAACATC ATAGTGATGG GGGAGATGGA GGACAAAT6C 480 
TACCCCTACA GAAACCACAT CTGCAATTTC TTTOACTTCG ATACCTTTGG GGGCCACATC 540 
AA6TTTGCTC TGGGATTTAA GGCAGCACAC TTGGA6GGCA OGGAGCTGAA GCATAT6GGA 600
CAGCAGCTGG TGGGTCAGTA CCOGATTCAC TTCCACCTGG COGGTGATGT AGAC6AAAGG 660
GGAGGTTATG ACCCACCCAC ATACATCAGG GACCTCTCCA TCCATCATAC ATTCTCTCGC 720 
TGCGTCACAG TCCATGGCTC CAATGGCTTG TTGATCAAGG ACGTTGTGGG CTATAACTCT 780 
TTGGGCCACT GCTTCTTCAC GGAAGATGGG CCGGAGGAAC GCAACACTTT TGACCACTGT 840 
CTTGGCCTCC TTGTCAAGTC TGGAACCCTC CTCCCCTCGG ACOSTGACAG CAAGATGTGC 900 
AASATGATCA CAGtSAOACTC CTAOCCAGSG TACATCOCCA A6CCCAG0CA AGACTGGAAT 960
6CTGTGTCCA CCTTCTGGAT GGCCAATOCC AACAACAACC TCATCAACTG T6006CTGCA 1020 
GGATCTGAQG AAACTGQATT TTGGTTTATT TTTCACCAOG TACCAAC3GGQ CCCCTCOSTG 1080 
GGAATGTACT CCCCAGGTTA TTCAGAGCAC ATTCXACTGG GAAAATTCTA TAACaU^CCQA 1140 
GCACATTCCA ACTACCGGGC T6GCATGATC ATAGACAACX3 GAGTCAAAAC CACCXSAG6CC 1200
TCTGCCAAGG ACAA6G6GCC GTTCCTCTCA ATCATCTCTG OCAGATACAG CCCTCAGQU3 1260
GAG6C0GACC CGCT6AA6CC CCX3GGAGCC6 6CC31TCATCA GACACTTCAT TGCCTACAAG 1320 
AACCAGGACC AOGGGGCCTG GCTGCGCX3GC GGGGATGTGT GGCTGGACAG CTGCCGGTTT 1380 
GCTGACAATG GCATTGGCCT GACCCTGGCC AGTGGTGGAA CXTTTCCCGTA TGAC6ACGGC 1440 
TCCAAGCAAG A6ATAAAGAA CAGCTT6TTT GTT6GCGAGA GTGGCAACX3T GGGGACGGAA 1500 
ATGATGGACA ATA6GATCT6 GGGCGCIGGC GGCTTGGACC ATftGGOGAAG 6AC0CTCXX:T 1560
ATAGGOCAGA ATTTTCCAAT TAGAGGAATT CAGTTATATG ATGQCCCCAT CAACAT0C3UV 1620 
AACTGCACTT TCOGAAAGTT TGTGGCCCTG GAGGGCCGGC ACACCAGCGC CCTGQCCTTC 1680 
CGCCTGAATA ATGCCTGGCA QA6CTGCCCC CATAACAACG T6ACC3GQCAT TGCCTTTQAQ 1740 
GAC9GTTCC6A TTAC7TCCAG AGTGT7CTTC 6GAGA6CCTG G6CCCTGG7T CAACCAGCTG 1800
OACATGOATG G68ATAAQAC ATCTOTGTTC GATGAOGTCG AOGGCTCOST GTCaSAOTAC 1860
CCTGGCTOCT ACCTCAOQAA GAATQACAAC TG6CTGGTCC GGCACCCAGA CTGCATCAAT 1920 
GTTCCCGACT GGAGAGGGGC CATTTGCAGT GGGTGCTATG CACA6ATGTA CATTCAAGCC 1980 
TACAAGACCA GTAACCTGCG AATGAAGATC ATCAAGAATG ACTTCCCCAG CCACCCTCTT 2040 
TACCTGGAGG GGGCGCTCAC CAGGAGCACC CATTACCAGC AATACCAACC GGTTGTCACC 2100 
CT6CA6AA00 OCTACAOCAT CGACTGGGAC CSUSACGGCCC CCGCOGAACT OGCCATCTOG 2160
CTCATCAACT TCAACAAGG6 CGACT6GATC CGAGIGG66C TCTGCTACCC 60GAGGCACC 2220 
ACATTCTCCA TCXnTCTCGGA TGTTCACAAT CGCCTGCT6A AGCAAACX3TC CAAGACGG6C 2280 
GTCTTCXTPGA QGACCTTGCA GATGGACAAA GTGGAGCAGA GCTACCXTrGG CAGGAGCCAC 2340 
TACTACTGGG AOQAGGACTC A6GGCT6TTG TTCCT6AAGC TGAAA6CTCA 6AAGGAGAOA 2400

aAGAAGTTTO CTTTCTOCTC CATOAAAGGC TOTOAGAaGA TAAAOATXAA AOCTCTOATT 2460
CCAAAGAAGG CAGGOSTCAG T8AGT6CACA GCCACAGCTT ACCCCAA6TT CACGGAGAGO 2520 
GCTGTCGTAG ACGTGCCGAT GCCCAAGAAG CTCTTTGGTT CTCAGCT6AA AACAAAGGAC 2580 
CATTTCTT6G AGGTQAAGAT GGAGAGTTCC AAGCAGCACT TCTTCCACCT CTGGAAOQAC 2640 
TTCGCTTACA TTGAAGTGGA TG6GAAGAAG TACCCCAGTT GGGAG6ATGG CATCCAGGTG 2700
GTGGTGATTG ACX3G6AACCA AOGGCGCGTG GTGAGCCACA CGAGCTTCAG GAACTCCATT 2760
CTGCAAGGCA TACCATGGCA GCTTTTCAAC TATGTGGOGA CCaTCCCTOA CAATTCCATA 2820 
GTGCTTATGG CATCAAAGGG AAGATAGGTC TCCAGAGGCC CATGGACXAG AGTGCTGGAA 2880 
AAGCTTQGGG CAGACAGGGG TCTCAAGTTG AAAGAGCAAA TGGCATTOGT TGGCTTCAAA 2940 
GGCAGCTTCC GGCCCATCTG GGTGACACTG GACACTGAGG ATCACAAAGC CAAAATCTTC 3000
CAAGTTGTGC CCATCCCTGT GGT6AAGAA6 AA6AAQTTGT OAGGACAGCX GCCGOOOQQT 3060
GCCACCTCOT GOTAGACTAT OAOGGTGACT CTTG6CAGCA GACCA6TGG6 G6ATGGCTGG 3120 
GTCCCCCAGC CCCTGCCAGC AGCTGCCTGG GAAGGCOGTG TTTCAGCCCT GATGGGCCAA 3180 
GGGAAGGCTA TCAGAGACCC TGGTGCTGCC ACCTGCCCCT ACTCAAGTGT CTACCTGGAG 3240 
CCCCTGGGGC GGTGCTGGCC AATGCTGGAA ACATTCACTT TCCTGCAGCC TCTT6GGTGC 3300
TTCTCTCCTA TCTGTGCCTC TTCAGTGGG6 GTTTGGGGAC CATATCAG6A GACCTGGGTT 3360
OTGCTGACAG CAAAGATCXA CTTTGGCAG6 AGCOCTGACC CAGCTAG6A0 GTAGTCTGGA 3420 
GGGCTGGTCA TTCACAQATC CCCATGGTCT TCAGCAGACA AGTGAGGGTG GTAAATGTAG 3480 
GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACTCCTGTA A6CAAGAGCC AACCTCACAG 3540 
GATTAGGAGC TGGGGTASAA CTGGCTATCC TTGGGGAAGA GGCAAGCCCT GCCTCTGGCC 3600

GTGTCCACCr TTCAGGAGAC TTTGAGTGGC AGGTTTGGAC TTGGACTAGA TGACTCTCAA 3660
AGGCCCTTTT AGTTCTGAGA TTCCAGAAAT CTGCTGCATT TCACATGGTA CCTGGAACCC 3720 
AACAGTTCAT GGATATCCAC TGATATCCAT GATGCTGGGT GCCCCAGCX3C ACACGGGATG 3780 
GAGAGGTGAG AACTAATGCC TAGCTTGAGG GGTCTGCAGT CCAGTAGGGC AGGCAGTCAG 3840 
GTCCATGTGC ACTGCAATGC OMSGTGGAGA AATCACAGAG AGGTAAAATG GAGGCCAGTG 3900 
CCATTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATQAAGG CTGGGGGCAT 3960
TTTGCTGGGG GGAGATGAGQ CAGCCTCTGG AATGGCTCAG GGATTCAGCC CTCCCTGCCG 4020 
CTQCCTGCTG AAGCTGGT6A CTAOGGGGTC OCCCTTTGCT CAOGTCTCTC TGGCGCACTC 4080 
ATGATGGAOA AGTGXGGTCA GAGGGGAGGA ATGGGCTTTG CTGCTTATGA GCACAGAGGA 4140 
ATTCAGTCGC CAGGCAGCCC TGCCTCTGAC TCCAAGAGG6 TGAAGTCCAC AGAAST6A6C 4200

TCCTGCCTTA GGGOCTCATT TGCTCTTCAT CCA6GGAACT GAGCACAGGG GGCCTCCAGG 4260
AGACCCTAGA TGTGCTCGTA CTCCCTCGGC CTGGGATTTC AGAGCTGGAA ATATAGAAAA 4320 
TATCTAGCCC AAA6CCTTCA TTTTAACAGA TGGGGAAAGT GAG CCCC CAA GATGGGAAA6 4380 
AAOCAGACAG CTAAGGGAGG GCCTGQGGAG CCCCACCCTA GCCCTTGCTQ CCAC31CCACA 4440 
TTGCCTCAAC AAC066CCCC AGAGTGCCCA G6CACTCCTG AGGTAGCTTC TGGAAATGGG 4500 
GACAAQTCX:C CTCGAAGGAA AGGAAATGAC TAGAGTAGAA TGAC3M3CTAG CAGATCTCTT 4560
CCCTCCTGCT CCCAGCGCAC ACAAACCOGC CCTCCOCTTG GTGTTGGCGG TCCCTGTGGC 4620 
CTTCACTTTG TTCACTACCT GTCAGCCCAG CCTGGGTGCA CAGTA6CTGC AACTCCCCAT 4680 
TGGTQCTACC TGGCTCTCCT GTCTCTGCAG CTCTACAGGT GAGGCCCAGC AGAOGGAGTA 4740 
GG6CT0GCCA TGTTTCTG G T GAGCCAATTT GGCTGATCTT GGOTGTCTGA ACAGCTATTO 4800 
GGTCCACX:CC AGTCCCTTTC AGCTGCTGCT TAATGCCCTG CTCTCTCCCT GGCCCACCTT 4860
ATAGAGAGCC CAAAGAGCTC CTGTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATCTTGC 4920 
ACGAGGCACC AGAGTCTCCC TG6GTCTTGT GATGAACTAC ATTTATCCCC TTTCCTGCCC 4980 
CAACCACAAA CTCTT TCCTT CAAAGAGGGC CTGCCTGGCT CCCTCCACCC AACTG CACC C 5040 
ATQAGACTCS GTCCAAGAGT CCATTCCCCA GGTGGGA6CC AACTGTCAG6 GAGGTCTTTC 5100 
CCACCAAACA TCTTTCAGCT GCTGGGAGGT GACCATAGGG CTCTGCTTTT AAAGATATGG 5160
CTGCTTCAAA GGCCAGAGTC ACAGGAAGGA CTTCTTCCAG GGAGATTAGT GGTGATGGAO 5220 
AGGAGAGTTA AAATGACCTC ATGTCCTTCT TGTCCA06GT TTT6TTGAGT TTTCACTCTT 5280 
CTAATGCAAG GG TCTCA CAC TGTGAACCAC TTA6GATGTG ATCACTTTCA G6T66CCAG0 5340 
AATGTT6AAT GTCTTTG6CT CAGTTC3VTTT AA7UVAAGATA TCTATTT6AA AGTTCTCAGA 5400 
GTTGTACATA TGTTTCACAG TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460
ACCAAGAGCC AATATCTAGG CATTTTCTTG GTAGCACAAA TTTTCTTATT GCTTAGAAAA S520 
TTOTCCTCCT TGTTATTTCT GTTTGTAAGA CTTAAGTGAG TTAGGTCTTT AAGGAAAGCA 5580 




ACGCTCCTCT GAAATGCTTG TCTTTTTTCT GTTGOOQAAA TAGCTG6TCC TTTTTOOGGA 5640 
GTTAGATGTA TAGAGTGTTT GTATGTAAAC ATrTCTTGTA GGCATCACCA TGAACAAAGA 5700 
TATATTTTCT ATTTATTTAT TATATGTGCA CTTCAAGAAG TCACTGTCAG AGAAATAAA6 5760 
AATTGTCTTA AAT6TCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 451 Protein sequence
Protein Accession #: XP_051860.2

1 11 21 31 41 51 

I I I I I I 

MDGVNLSTEV VYKKGQDYRF ACYDRGRACR SYRVRFLCGK PVRPKLTVTI DTNVNSTILM 60 

LBDNVQSWKP GDTLVIASTD YSMYQAEEFQ VLPCRSCAPN QVKVAGKPMY LHIOEEIDGV 120 

ra^RABVGLLS RNIIVM6EMB DKCYPYRNHI OrFFDFDTFG (3iXKFAL6FK AAHLEGTBLK 160 

BMGQQLVGQy FXHFRIiAGDV DERQGnCDPPT YXBDLSXBBT F8RCVTVB6S KQtiLIKDWG 240 

YNSLGHCFFT EOGPEERNTF DHCLGLLVKS GTLLPSDRDS KMCRHIT6DS YPGYIPICPRQ 300 

DCNAVSTFWM ANPNNNLINC AAAGSEETGF WFIFHHVPTG PSVGMYSPGY SEHIPLGKPY 360 

NNJIAHSNYRA GMIIDNGVKT TEASAKDKRP FLSIISARYS PHQDADPUCP REPAIIHHFI 420 

AYRNQDHOAW LRGGDVWLDS CRFADHGI6L TLASGGTFPY DDGSKQEIKN SIiFVGESGNV 480 

GTEMKDNRZH GPG8U)KSGR TIiPIGQ^PI RGIQIiYDGPI NIQNCTFRKF V ALBB RHTSA 540 

LAFRLNNAWQ SCFHNNVTGI AFEDVPITSR VFFGBPGPNF NQLDMDOTKT SVFEDVDGSV 600 

SEYPGSYLTK NDNWLVRHPD CINVPDWRGA ICSGCYAQMY IQAYKTSNLR MKIIKNDPPS 660 

HPLYLBGALT RSTHYQQYQP WTIiQKGYTI HWDQTAPAEL AIWLINFNKG DWIRVGLCYP 720 

RGTTFSILSD VBNRLLKQTS KTGVFVRTLQ MDKVEQSYP6 RSHYYWDEDS GUiFLKLKAQ 780 

NEREKFAFCS MKGCERZKIK ALIFKNAGVS DCTATAYPKE TERAWDVPM PKKLFGSQUC 840 

TKDHFIiBVKM ESSKQHFFHL WNDFAYIBVD 6KRVPSSEDG IQVWIDGEIQ GRWSHTSFR 900 

NSILQGIPWQ LFNYVATIPD NSIVLMASKG RYVSRGPWTR VLEKU3ADRG LKLKEQMAFV 960 
GFKGSFRPIW VTU>TBDHKA KIPQfWPIPV VKKKKL 

Seq ID NO: 452 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 261..2861

1 11 21 31 41 51

I I I I I I

GAGCTAGCOC TCAAGCA6A6 CCCftGCGOGG TGCTATOSGA CAGAGCCTGG CGAGOSCAAG 60 

CXK3CG0SGGG AQCCAGOGOQ GCTGAGCGCG GCCAGGGTCT GAACCCAGAT TTCCCAGT^ 120 

AGCTACCACT CCGCTT6CCC AaGCCCC3GGa AGCTOGCGGC GCCTGGCGGT CAGOtSACCSlG 180 

AC6TC0GGGG CCGCTQCGCT CCTGGCCCGC GAGGCGTGAC ACTGTCTCX3G CTACAGACCC 240 

AGAGGQAOCA GACIGCCAOO AT66GAGCTG CTGGGAGGCA GGACTTCCTC TTCAAGGCCA 300 

TGCTGACCAT CAGCTG6CTC ACTCTGACCT GCTTCCCTGG QQCCACATCC ACAQTGGCTG 360 

CTGGGTGCCC TGACCAGAGC CCTGAGTTGC AACCCTGGAA CCCTGGCCAT GACCAAGACC 420 

ACCATGTGCA TATCGGCCAG GGCAAGACAC TQCTGCTCAC CTCTTCTGCC AOGGTCTATT 480 

CCATCCACAT CTCAGAGGGA GGCAA6CTGG TCATTAAAGA CCACQACGAG CCGATTGTTT 540 

TGGGAACCCG 6CACATCCT6 ATTGACAACG GAGOAGAGCT GCATGCTGQG AGTGCCCTCT 600 

OOCCTTTCCA GG6CAATTTC AOCATCATTT TGTATGGAAG GGCTGATGAA GGTATTCAGC 660 

CG6ATCCTTA CTATGGTCTG AAGTACATTG GGGTTGGTAA AGGAGGCGCT CTT6ABTTGC 720 

ATGGACAGAA AAAGCTCTCC TQOACATTTC TGAACAAGAC CXTTTCACCCA G0TO6CMG0 780 

CAGAAGQAGG CTATTTTTTT GAAAGGAGCT GGGGCCACCG TGGAGTTATT OTTCATOTCA 840 

TCQACCCCAA ATCAGGCACA GTCATCCATT CPGACCGGTT TGACACCTAT AGATCCAAGA 900 

AAGAGAGTGA ACGTCTGGTC CAGTATTTGA AC3GCGGTGCC CGATGGCAGG ATCCTTTCTG 960 

TTGCAGTGAA TGATGAAGGT TCT06AAATC TOQATGACAT OGCCAQQAAO GOGATGACCS^ 1020 

AATTGGGAAG CAAACACTTC CTGCACCTTO GATTTAGACA CCCnGOAOT TTTGTAACTG 1080 

TGAAAGGAAA TCCATCATCT TCAGTGGAAG ACCATATT6A ATATCATG6A CATCGAGGCT 1140 

CTGCT6CTGC CCQGGTATTC AAATTGTTCC AGACAGAGCA TGGC3GAATAT TTCAATGTTT 1200 

CTTTGTCCAG TGAGTGGGTT CAAGACQTGG AGTGGACGQA GTGGTTCGAT CATGATAAAG 12 60 

TATCTCAGAC TAAA6GTGGG GAGAAAATTT CAGACCTCT6 6AAA6CTCAC CCAGGAAAAA 1320 

TATGCAATCG TOCXATTGAT ATACA6QCCA CTACAATGOA TGGASTTAAC CTCA6CAC0S 1380 

AG6TTGTCTA CAAAAAAGGC CAGGATTATA GGTTT6CTTG CTACXSACCGG GGCAGA6CCT 1440 

GCCX3GAGCTA COGTGTAOGG TTCCTCTGTG GGAAGCCTGT GAGGCCCAAA CTCACAGTCA 1500 

CCATTGACAC CAATGTGAAC AGCACCATTC TGAACTTGGA 6GATAATGTA CA6TCATGGA 1560 

AACCTGGAGA TAOCCTGGTC ATTGCCAGTA CTGATTACTC CATGTACCAG GGAGAA6A6T 1620 

TCGAGGTGCT TCGCTGCAGA TCCTG06CCC CCAACCAG6T CAAAGTGGCA GGGAAAGCAA 1680 

TGTACCTGCA CATGGG6GA6 GAGATAGACQ GOGTGGACAT GCGGGCGGAG GTTGGGCTTC 1740 

TQAGCCGQAA CATCATAGTG ATGGGGGAGA TGGAGGACAA ATGCTACCCC TAC3W3AAACC 1800 

ACATCTGCAA TTTCTTTQAC TTCGATACCT TTGGGGGCC3V CATCAAGTTT GCTCTGGGAT 1860 

TTAAQGCAGC ACACTTGGAG GGCA0GGA6C TGAAGCATAT GGGACAGCAG CTGGTGG6TC 1920 

AGTACCC6AT TCACTTCCAC CTGGCGGGTG AIGTAGAOSA AAGGGGAGGT TATGACCCAC 1980 

CCACATACAT CAGG6ACCTC TOCATCCATC ATAOVTTCrC TCGCTGOQTC ACAGTCCAT6 2040 

GCTCCAATGG CTTGTTGATC AAGGACGTTG TGGGOTATAA CTCTTTGGGC CACTOCTTCT 2100 

TCACGGAAGA TGGGCCGGAG GAACX3CAACA CTTTTGACCA CTGTCTTGGC CTCCTTGTCA 2160 

AGTCTGQAAC CCTCCTCCCC T0GGACC6TG ACAGCAAGAT GTGCAAGATQ ATCACAOAGG 2220 

ACTCCTACCC AGGGTACATC COC3WU3COCA 6GCAAGACTG CAAT6CTQT6 TCCACCTTCT 2280 

GGATG6GCAA TCCCAACAAC AACCTCATCA ACT6TGCG6C TGCAGGATCT GAGGAAACTG 2340 

GATTTTGGTT TATTTTTCAC CAC3GTACCAA CGGGCCCCTC OGTGGGAATG TACTCCCCAG 2400 

GTTATTCAGA GCACATTCCA CTGGGAAAAT TCTATAACAA CCGAGCACAT TCCAACTACC 2460 

GGGCTGGCAT GATCATAQAC AACGGAQTCA AAACCACCGA GGCCTCTGCX: AAGGACAAGC 2520 

GGCaiTT O CJ CTCAATCATC TCTGOCAaAT ACAGCCCTCA CX3VGGACGCC GACCCGCTQA 2580 

AGCCCCGGGA GCGGGCCATC ATCA6ACACT TCATTGCCTA CAAGAACCAG GACCACGGQG 2640 

CCTGGCTGCG CGGCGGGGAT GTGTGGCTGG ACAGCTGCCA TTTCAGAGGG GAGGCTCAGG 2700 

AAGGCTTCTT GCTTACAGGA AXGAAGGCTG GGGGCATTTT GCTGGGGGOA OATGAGGCAG 2760 

CCTCT6GAAT GGCTCAGGGA TTCAGCCCTC CCTGCCGCOXS CCTGCTGAAG CTGGTGACTA 2820 

OGGGGTCGCC CTTTGCTCAC GTCTCTCTGG CCCACTCATG ATGGAGAAGT GTGGTCAGAG 2880 

GGGAGCAATG GGCTTTGCTG CTTATGAGCA CAGAGQAATT CAGTCCCCAG GCAGCCCTGC 2940 

CTCTGACTCC AAGAGGGTGA AGTCCACAGA ASTGAGCTCC TGCCTTAGGG CCECATTT6C 3000 

TCTTCATCCA GGGAACTGAG CACAGGGGGC CTCCAGGAGA OCCTAQATOT GCTGGTACTC 3060 

CCTCX3GCCTQ GQATTTCAGA GCTGGAAATA TAGAAAATAT CTAOOCCAAA GCCTTCATTT 3120 
TAACAGAT66 GGAAAGTGAO CCOCCAAGAT GGGAAAGAAC CACACA6CTA AGGGAG6GCC
TGGGGAGCCC CACCCTAGCC CTTGCTGCCA CACCACATTG CCTCAACAAC CGGCCCCAGA 
GTGCXXAGGC ACTCCTOAGQ TAGCTTCTGG AAATGGGGAC AAGTCCCCTC GAAGGAAAGG 
AAATGACTAO AGTAGAATGA CAGCTA6CA6 ATCTCTTCCC TC CTGC TCCC AGCXK3VCACA 
AACCOGSCOCT OCOCTTGCrTG TTGGCGGTCC CTGIOGCCTT CACTTTGTTC ACTACCTGTC 
AGCrCAGCCT GGGTGCACAG TAGCTGCAAC TOCCCATTGG TGCTACCTGG CTCTCCTGTC 
TCTGCAGCTC TACAGGTGAG GCCCAGCAGA GGC5A6TAGGG CTOGCCATGT TTCTGQTC3AG • 
CCAATTTGGC TGATCTTGGG TGTCTGAACA GCTATTGQQT CCACCCCA6T CCCTTTCAGC 
TGCTGCTTAA TGCCCTGCTC TCTCCCTGQC CCACCTTATA GAGAGCCCAA AGAGCTCCTG 
TAA6A6GGAG AACTCTATCT OTOOTTTATA ATCTTGCA06 AGGCACCAGA GTCTCCCTGG 
GTCTT6T6AT GAACTACATT TATCCCCTTT CCTGOCCCAA CCACAAACTC TTTCCTTCAA 
AGAQGGCCTG CCTGGCTCXX: TCCACCCAAC TGCAOXATG AGACTOGGTC CAAGAGTCCA 
TTCCCCAGGT GGGAGCCAAC TGTCAGGGAG GTCTTTCCCA CCAAACATCT TTCAGCT6CT 
GGGAGGT6AC CATAGGGCTC TGCTTTTAAA QATATGGCTG CTTCAAAGGC CAGAGTCACA 
GGAAGGACTT CTTCCAGQGA QATTAGTGGT GATGGAGAGG AGAGTTAAAA TGACCTCATO 
TCCTTCTTGT CCAGGGTTTT GTTGAGTTTT CACTCTTCTA ATGCAAGGGT CTCACACT6T 
GAACCACTTA GGATGTGATC ACTTTCAGGT GGCCAGGAAT GTTGAATGTC TTTGGCTCAO 
TTCATTTAAA AAAGATATCT ATTTGAAAGT TCTCAGAGTT GTAC3^TATGT TTCACAGTAC 
AGGATCTGTA CATAAAAGTT TCTTTCCTAA ACCATTCACC AAGAGCCAAT ATCTAGGCAT 
TTTCTTGGIA GGACAAATTT TCTTATTGCT TAGAAAATTG TCCTCCTTGT TATTTCTGTT 
TOTAAGACTT AAGTQAQTTA GGTCTTTAAG GAAAGCAACO CTCCrCTOAA ATG CTTGT CT 
TTTTTCTGTT GCCGAAATAG CTGGTCCTTT TTCGGOAaTT AGATGTATAG AGTGTTTGTA 
TGTAAACATT TCTTGTAGGC ATCACCATGA ACAAAGATAT ATTTTCTATr TATTTATTAT 
ATGTGCACTT CRAGAAGTCA CrGTCAGAGA AATAAAGAAT TGTCTTAAAT GTCATGA7T6 
6AGATGTCCT TTGCATTGCT TGGAAGGGGT GTACCTAGAG CCAAGOAAAT TG6CTCTGGT 
TTGGAAAAAT TTTGCTGTTA TTATA6TAAA CATACAAAGG ATGTCAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AA 



Seq ID NO: 453 Protein sequence 
Protein Accession #: Eos sequence 



1 
I 

MGAAGRQDFL 
GKTLLLTSSA 
TIZLYGHADE 
ERSWGHRGVI 
SRNLDDMARK 
KLFQTEHGEY 
IQATTMDGVN 
STILNLEDNV 
BIOGVDMRAE 
GTBLKHMGQQ 
KDWGYNSLG 
PKPRQDCNAV 
LGKFYNHRAH 
XRHFIAYKNQ 
FSPPCRCLLK 



11 
I 

FKAMLTISWL 
TVYSIHISBG 
GIQPDPYYGL 
VHVZDPKSGT 
AMTXLGSKHF 
FNVSLSSEW 
LSTEWYKKG 
QSWKPCaJTLV 
VGIiLSRNZIV 
LVGQYPIHFB 
HCPFTSDGPE 
STFWMANFMN 
SNYHAGMIID 
DKGAHIiRGGD 
LVrrOSPFAB 



21 

1 

TLTCPPGATS 
GKLVIKDHDE 
KYIGVGKGGA 
VIHSDRPDTY 
lALGFBHPHS 
QDVEHTEWFD 
QDYRFACYDR 
lASTDYSMYQ 
HGEMEDKCYP 
LAGDVDERGG 
ERNTFDRCLG 
NLINCAAAGS 
NGVKTrEASA 
VWLDSCEIFRG 
VSLAH8 



31 
I 

TVAAGCPDQS 
PIVLRTRHIL 
LELKGQKKLS 



FLTVKQWSS 
HDKVSQTKGG 
6RACRSYRVR 
AEBFQVLPCR 
YRKHICNFFD 
yDPPTyZRDL 
LLVKSGTZiLP 
EET6FWPIFH 
KDKRPFLSII 
BAQB6FLLTG 



41 

I 

PELQPffNPGB 
IDI3GGELHAG 
WTFLNKTLHP 
QYLNAVPDGR 
SVEDBIBYHG 
ERISDL14KAH 
PLCGKPVRPK 
BCAPNQVKVA 
FDTPGGHIKF 
8IRBTFSRCV 



HVPTGPSVGM 
SARYSPEQDA 
HKAGGZLLGG 



Seq ID NO: 454 DNA sequence

Nucleic Acid Accession #: NM_013282.2

Coding sequence: 85..2466



1 
I 

OQACTCCTTA 
GTCOCTCCCC 
AC0CACA080 
CA06AGCTGT 
6AGGA0GGCC 
GTCaSCCAGA 
ACCX3ACTCCG 
OCXS3COGCC6 
GGGCTGTACA 
GAGGG6CAGG 
AOGTCCAGGC 
GAGAAOGGCG 
AAGTGQCA6G 
AAGGAGCGGG 
CGGGAACTCT 
TTOGTGGAOG 
CCCATGAGAC 
TGCCGGGTCT 
TGCGATGAGT 
CCCAGOGAGG 
GCQGGAGAGG 
TCACAGCGGG 
GTCCCGTCCA 
CGAGTCCAGG 
AGCAACGAOG 
GGGAATTTTT 
GGGGAACAGT 
TTTGCTCCCA 
GTCAGGGTGG 
AACCX3CTA06 
TTTCTOGTOT 
QAG66GAAGG 



11 

i 

GAGCATGGCA 
TCAGCGCC6A 
TGGACTOGCT 
TCCACGTOGA 
ATACCCTCTT 
GCCTCGTGCT 
GCTGCTGCCT 
AGACTOACAG 
AGGTCAATGA 
TGGTCAGGGT 
CGC3CGCTGGA 
TGGTCCAGAT 
ACCTGGAGGT 
GCTTCTGGTA 
ACGCCAACX3T 
AAGTCTTCAA 
GGAAGAGCGQ 
GCX5CCTGCCA 
GC6ACATGGC 
ACQAGTGGTA 
GGCTGAGAGA 
ACTGGGGCAA 
ACCACTACGG 
TCAGCGAGTC 
GAGOSTACTC 
TCACATACAC 
CTTGTGATCA 
TCAATGACCA 
TGCGCAATGT 
ATGGCATCTA 
QGCGCTACCT 
ACOGGATCAA 



21 

I 

tggctcagag 
cacx:atgtg6 

GTCCAGGCIG 
GGCAQGCCTO 

CGACTAOGAG 
CCCCCACAGC 
GGGCCAGAGT 
CAGGCCASCC 
GTAGOTCGAT 
GACG0GGAA6 
GGAGGACGTC 
GAACTCXAGG 
G66CCAGGTG 
06AC6CGGA6 
GGTGCTGGGG 
GATTGAGOGG 
GCCGTCCTGC 
CCTGTGOSGG 
CTTCCACATC 
CTGCCCTGAG 
GAGCAAGAAG 
GGGCATGGCC 
ACCCATCXXX3 
GGGTGTCCAT 
CCTAGTCCTG 
GGGTAGTGGT 
GAAACTCACC 
AGAAGGGGCC 
CAAGGGTGGC 
CAAGGTTGTG 
TCTGCX3GAG6 
QAA6CTGG6G 



31 
I 

GTGCTGGTAA 
ATCCAGGTTC 
ACCAAGQT66 
CAOAGGCTGr 
GTCCGCCTGA 
ACCAAGGAGC 
GAGTCAGACA 
GATGAGGACA 
GCT06GGACA 
GCCCCCTCCC 
ATTTACCACG 
GACGTCCGAG 
GTCATGCTCA 
ATCTCCAGGA 
GATGATTCTC 
C06GGTGAAG 
AAGCACTGCA 
GGCOGGCAGG 
TACTGCCTGG 
TCCGG8AATG 
AAGGGGAA6A 
TGTGZGGGCC 
GGGATCCCCG 
CGGCCCCACG 
GCGGGGGGCT 
G6T0GAGATC 
AACACCAACA 
GAGGCCAAGO 
AAGAATAGCA 
AAATACTGGC 
GAC6ATGAT0 
CT6ACXIAT6C 



41 

I 

AACTGATGGG 
GOACCATGGA 
AGQAfiCIGAO 
TCIACASGGG 
ATGACACCAT 
GGGACTCCGA 
AGTOCTCCAC 
T6TGGGATGA 
CGAACATGGG 
GGGACGAGCC 
TGAAATACGA 
CGCGOGCCCG 
ACTACAACCC 
AGGGCGAGAC 
TOAACGACTG 
GGAOCCCCAT 
AGGACGACGT 
ACCCCGACAA 
ACCOGCCCCT 
ATGCCA6CGA 
TGGCCTCX36C 
GCACCAAGOA 
TGGGCACCAT 
TGGCTGGCAT 
ATGAGGATGA 
TTTC06QCAA 
GGG0GCTG6C 
ACTGGCGGTC 
AGTACGCCCC 
CC6ASAAGGG 
AGCCTGGCOC 
AGTATCCAOA 



3160 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
51

I 

OQDHBVBIGQ 

SALCPFQGHF 
GGMABGGYFF 
ILSVAVNDEG 
HRGSAAARVP 
PGKIQIRPID 
LTVTZDTNVN 
GKPMYLHIGB 
ALGPKAABLE 
TVHGSMGLLI 
ITEDSYPGYI 
YSPGYSEHIP 
DPLKPREPAI 
DBAASGMAQG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 



51 
I 

GGTTTTTGCT 
0GG6AGGCAG 
OCGGAAGATC 
CAAACAGATG 
CCAGCTCCTG 
GCTCTCCGAC 
CCACGGCGAG 
GACGGAATTG 
GGCGTGGTTT 
CTGCAGCTCC 
CGACTACCOG 
CACCATCATC 
CGACAACCCC 
CAGGACGGGG 
TOGGATCATC 
GQTTGACAAC 
GAACAGACTC 
GCAGCTCATG 
CAGCAGTGTT 
GGTGGTACTG 
CACATCOTCC 
ATGTACCATC 
GTGGCGGTTC 
ACACGGCOGG 
CGXG6ACCAT 
CAAGAGGACC 
TCTCAACTGC 
GGGGAAGCOG 
CGCTGAG6GC 
GAAGTC06GG 
TTGGAOGAAG 
AGOCTACCTO 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



GAAGCCCTGG CCAACOGAGA GCGAGAGAA6 GAGAACAGCA AGAGGGAGGA GGA6GAGCA0 1980
CAGGAGGGGG GCTTOSOOTC CCCCAGGACG 6QCAAGGGCA AGTGGAAGC6 GAAGTOGGCA 2040
GGAGGTGGCC CGAGCftGGGC CGGGTCCCCG OGCCGGACAT CCAAGAAAAC CAAQOTGOAG 2100
CCCTACAGTC TCACGGCCCA GCAGAGCAGC CTCATCAGAG AGGACAAGAG CAACGCCAAG 2160
CTGTGGAATG AGGTCCTGGC GTCACTCAAG GACCX3GC0GG CQAGCGGCAG CCCGTTCCAG 2220
TTGTTCCTGA GTAAAGTQGA GGAGACGTTC CaGTGTATCT GCTGTCAGGA GCTGGTGTTC 2280
CGGCCCATCA CGACOGTOTG CCAGCACAftC QTGtOCAAGG ACTGCCTG6A CAGATCCTTT 2340
GGGGCACAG6 TGTTCAfiCTG CCCTGCCTOC CGCTAC3GACC TGGGCCGCAG CTATGCCATG 2400
CAGGTOAACC AGCCTCTGCA GACCGTCCTC AACCAGCTCT TCCCCGGCTA CGGCAATGGC 2460
C66TGATCTC CAAGCACTTC TCGACAGGCG TTTTGCTGAA AACGTGTCGG AGGGCTCJGTT 2520
CATOQQCaCT GA-TTTTGrTC TT AGTGGGC T TAACTTAAAC AGGTAGTGTT TCC TCCG TTC 2580
CCTAAAAAOG TTTGTCTTCC TTTTTTTTTA TTTTTATTTT TCAAATCTAT ACATTTTCAQ 2640
GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT TGTGTTTAGT TCTTTQAAAA 2700
CATAAAAGCC TGCAATTTCT OGACA AAACA ACACAAGATT TTTSMMSPa GGAATCAGAA 2760
ACTACX3TGGT GTGGAGGCTG TTGATGTTTC TGGTGTCAAQ TTCTCAQAAG TEGCTOCCAC 2820
CAACTCTTTA AGAAGGCX3AC AG GATC AGTC CTTCTCTAGG GTTCTGGCCC CCAAGGTCAG 2880
AGCAAGCATC TTCCTGACAO CATTTTGTCA TCTAAAGTCC AGTGACATGG TTCCCCGTGG 2940
TGGCCGGTG6 CA6CC0STG0 CATG6GGTGG CTCAQCTGTC TGTTGAAGTT GTTGCAAGGA 3000
AAA6AGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGCCTC AAAGCCATCC CCCACCAGAC 3060
T6CTTA6CGT CT6A6ATCCQ CGTGAAAAGT CCTCTGCCCA CBAGRGCAG6 GAGTTGGG6C 3120
CACGCAOAAA TGGCCTCAAG GGOACTCTGC TCCACGTQGQ GCCAGGOtSTG TGACTGAOGC 3180
TGfTCCGACXSA RQGCGGCCAC GGACGGACGC CAGCACACGA AGTCAOGTGC AAGTGCCTTT 3240
OATTCGTTCC TTCTTTCTAA AGACGACAGT CTTTGTTGTT AGCACTGAAT TATTGAAAAT 3300
GTCAACCAGA TTCTAGAAAC TGCX3GTCATC CAGTTCTTCC T6ACACCQGA TGGGTGCTTG 3360
GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT TTTAACTCA6 CAAGTORGAA 3420
CTTACAAGAG GGTTTTTTTT TAATTTTTTT TTCTCTTAAT GAACACATTT TCTAAATGAA 3480
TTTTTTTTGT AGTTACTGTA TATGTACCAA GAAAQATATA ACGTTAGGGT TTGGTTGTTT 3540
XTGTTTTTGT ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT TTTTCTAATT TTACCAAAGT 3600
rTGCAGCXTA TACCTCAATA A AACAGGGAT ATTTTAAATC ACATACXTTGC AGACAAACT6 3660
GAGCAATGTT ATTTTTAAAG GOTTTTTTTC ACCTCCTTAT TCTTAOATTA TTAATGTATT 3720
AGGGAAGAAT GAGACAATTT TGTGTAGGCT TTTTCTAAA6 TCCRGTACTT TGTOCAGATT 3780
TTAQATTCTC AGAATAAATG TTTTTCACAG ATTGAAAAAA AAAAAAAA



Seq ID NO: 455 Protein sequence
Protein Accession #: NP_037414.2

1 11 21 31 41 51 

MMIQVRTMDG RQTHTVDSLS RLTKVEELRR KIQELPHVEP GLQRLFYRGK QMEDGHTLFD 60 

YBVRIiNDTlQ LLVHQSIiVLP HSTKBRDSBL SDTOSGCCW QSESDKSSTH GEAAAETDSR 120

PADEDMWDET ELGLYKVNBY VDARDTNMGA WPBAQWRVT RKAPSRDBPC SSTSRPALEE 180 

DVIYHVKYDD YPBNGWQMN SRDVRARART IIKWQDLEVG QWMUJVNPD NPKBR6PWYD 240 

AEISRKRETR TARBLYANW LGDDSIilDCR IIFVDEVFKI ERPGEGSPMV DNPMRRKSGP 300 

SCKHCKDDVN RLCRVCACHL CGGRQDPDKQ LMCDECDMAP HIYCLDPPLS SVPSEDEWYC 360 

PBOaanASEV VLAGBRLHES KKKAKMASAT SSSQBDWGKG MACVGRTKEC TIVPSNHYGP 420

IPQXPV6TMM RPRVQVSESG VHRPHVAGIH ORSHDGAYSL VLAGGYEDDV DHGNFFTYTG 480

SGGRDLSQJK RTAEQSCDQK LTNTNRALAL NCFAPHTOQE GAEAKDWRSG KPVRWRNVK 540 

GGKNSKYAPA EGHRYDGIYK WKYWPEKGK SGFLVWRYIiL SRDDDEPGPH TKEGKDRIKK 600 

LGLTMQYPBG YLBALANRER EKENSKRBBB EQQBGGFASP RTGKGKMKRK SAOGOPSRAO 660 

SPRRTSKKTK VEPYSLTAlQQ SSMRBDKSN AKLWNBVLAS LKI»PASGSP FQtPLSK^^ 720
TFQCICCQEL VFRPITTVCQ HNVCaOJCLDR SFRAQVPSCP ACRYDWRSY AMOVNQPLQT 

Seq ID NO: 456 DNA sequence
Nucleic Acid Accession #: NM_001200.1
Coding sequence: 325..1514

1 11 21 31 41 51 

GGGGACTTCr TGAACTTGCA GGGAGAATAA CTTQCGCACC CCACTTTGCSS OCaOTOCCTT 60 

TGCOXAQCG GAGCCTGCTT CGCCATCTCC GAGCCCCACC GCCCCTCCAC TCCTCGGCCT 120

TGCCC3GACAC TGAGACGCTC TTCCCAGC6T GAAAAGAGAG ACTGCGCGGC CG GCACC CGG 180 

GAGAAGGAGG AGGCAAAGAA AAGGAACGGA CATTCGGTCC TTGCQCCAG6 TCCTTTGACC 240 

AGAGTTTTTC CATGTGQAOG CTCTTTCAAT GQACOTGTCC COGCOTGCTT CTTAGACGGA 300 

CTGCGGTCTC CTAAAGOTCO ACCATGQTQO CCGGOACCOG CTGTCrTCTA GCGTTGCTGC 360 

TTCCCCAGGT CCTCCTGGGC G606CG6CTG 6CCTCGTTCC GGAGCTGGGC CGCAGGAAGT 420

TCGOKCXMC GTCGTCGGQC CGCCCCTCAT CCCAGCCCTC TGACGAGGTC CTGAGCGAGT 480 

TCGAGTTGCG GCTGCTCAGC ATGTTCGGCC TGAAACAGAG ACCCACCCCC AGCAGGGACG 540 

CCGIGGTGCC CCOCTACATG CTAGACCTGT ATCGCaGGCA CTCAGGTCAG CCGGGCTCAC 600 

CCQCCCCAQA CXaCCGGTTQ C3AGAGGQCAQ CC»QCCGACC CaACACTGTG CGCAGCTTCC 660 

ACCATGAAGA ATCTTTGGAA GAACTACCaG AAACGAGTGG GAAAACAACC CXWAGATTCT 720

TCTTTAATTT AAGTTCTATC CCCACGGAGG AGTTTATCAC CTCAGCAGAG CTTCAGGrPTT 780 

TCC3GAGAACA GATGCAAGAT GCTTTAGGAA ACAATAGCAG TTTCCATCAC OGA ATTAATA 840 

TTTATOAAAT CATAAAACCT GCAACAGOCA ACTOGAAATT CCCCGTGACC AGACTTTTGG 900 

ACACCAGGTT GGTOAATCAG AATGCAA6CA GGTGGGAAAG TTTTGATGTC ACCCCCGCTG 960 

TGATGCGGTG GACTGCACAG GGACACGCCA ACCATGGATT OGTGGIGGAA 6TGGCCCACT 1020

TGGAGGAGAA ACAAGGTGTC TCCAAGAGAC ATGTTAGQAT AAGCAflCTCT TTGCACCAAG 1080 

ATGAACACAG CTGGTCACAG ATAAGGCCAT TGCTAGTAAC TTTTG6CCAT GATGGAMjAG 1140 

GQCATOCTCT OCACftAAAGA GAAAAACGTC AAGCCAAACA CAAACAGCGG AAACX3CCTTA 1200 

AGTCCAQCTO TAAGAGACAC CCTTTGTACG TGGACTTCAG TGACGTGGGG T GGAATG ACT 1260 

GGATTGTGGC TCCCCCGGGG TATCACGCCT TTTACTGCC3V CGGAGAATGC CCTTTTCCTC 1320

TGGCTGATCA TCTGAACTCC ACTAATCATG CCATTGTTCA GAOQTTGGTC AACTCTOTTA 1380 

ACTCTAA6AT TCCTAAGGCA TGCTGTGTCC OGACAGAACT CAGTGCTATC TOWTMTOT 1440 

ACCTTCACGA GAATGAAAAG GTTGTATTAA AGRACIATCA QQACATGGTT CSFTOGAGOGmr 1500 
aTGOOTOTCG CTAOTACAQC AAAATTAAAT ACATAAATAT ATATATA 
1 11 21 31 41 51

| | | | | |

MVAGTRCLIA IJiLPQVLLGG AA6LVPELGR RKPAAA8SGR PSSQFSDEVL SBFELRIiLSM 60 

FGLXQRPTPS BSAWPFYML DLYRBBSGQP GSPABDHRLB XtAASRAMTVR SFHHEBSLBB 120 

LP&TSGKTTR RFFFNIiSSIP TGBFITSAEL QVFREQMQDA LGNNSSFMHR INZYEZZKPA 180 
TANSKPPVTR LLDT 

Seq ID NO: 458 DNA sequence

Nucleic Acid Accession #: NM_001999.2

Coding sequence: 1..8736

1 11 21 31 41 51

| | | | | |

ATGGSQAGAA GAOQQAGGCr OTGTCTCCAa CPCTACTTOC TGTC6CTGGG CTGTGTGGTG 60 

CTCTGG6CGC AGGGCACGQC OGQGCASOCT CAGGCTCCTC 08CCCAAGCC GCCQ CGGCCC 120 

CAGCXS3CC6C OGCAACAGGT TOGGTCCGCT ACAGCAGGCT CTOAAGGCGG GTTTCTAGOQ 180 

CCCGAGTATC GCOAGGAQGG TGCCGCAOTG QCCAQCCGCG TCCGCCGGCX5 AGGACAGCAG 240 

GAGQTGCTCC GAGGGCCCAA CQTGTGC6GC TCCAGATTCX: ACTCXTTACTG CTGCCCTGGA 300 

T06AAGA06C TCCCIOGAGG AAACCAGT6C ATTGTCCOGA TTTGTAGAAA TAGTTGTGQA 360 

GATGGATTTT GTTOCOGTCC TAACATGTGT ACTTSTTCCA GTG66CAAAT ATCATCAACC 420 

TGTGGATCAA AATCAATTCA 6CAGTGCA6T GTGAGATGCA TGAATGGTGG OACCTGTGCA 480 

GATGACCACT GCCAGTGCCA GAAAGGATAT ATTGQAACTT ATTGT6GACA ACCT6TCTGT 540

GAAAATGGAT GTCAOAATGG TGGACGTTGC ATCGCCCAAC CX3TGTGCTTG TCTTTATGOO 600 

TTCACTGOTC CACAGTGTGA AAGAGATTAC AGGACAGGCC CGTGTTTCAC TCAGGTCAAC 660 

AACXaGATOT GCCAAGOGCA GCTGACAGGC ATTGTCTGCA CQAAGACTCT GTGCTGTGCC 720 

ACCACTGQAC GGGCGTGGGG CCATCCCTGT GAGATGTGTC CAOCCCAOCC TCA6CCCTGC 780 

OGACGGGGTT TCATCCCCSUV CATCXXSCACT GGAGCTTGCC AAQATGTTGA TGAAT G0C3^G 840 

GCTATCCCAG GGATATGCCA AGGAGGAAAC TGTATCAATA CAGTGQGCTC TTTTGAATGC 900 

AGATGCCCTG CTGGTCACAA ACAGAGTGAA ACTACTCAGA AATGTGAAGA CATTGATGAG 960 

TGCAGCATCA TTCCTGGGAT ATGT6AAACT GGTGAATGTT CCAACACC3GT GGGAAGCTAT 1020 

TTTTGTGTTT GTCCACX3TG0 ATATGTAACC TC3UICAGATG GCTCTOGATG CATOGATCAG 1080 

AOAACAGGCA TGTGTTTCTC GGGCCTGGTG AATGGCC6CT GTGCACAnSA GCTGCGQGQG 1140 

AGAATGACGA AAATGCAOTQ CTGCTGT6AG CCT6GCJCGCT G CTGGGG CAT C3QGAACCATT 1200 

CXTOAAOCCT GTCCTOTCAG AGGTTCTGAG GAATATCGCA GACTTTGCAT GQATGGACTT 1260 

CCAATGOQAG QAATTCCAGG GAGTGCTGGT TCCAGACCTG GAGGCACTGG GGGAAATGGC 1320 

TTTGCCCCAA GTGGCAATGG CAATGGCTAT GGCCCAGGA6 GGACAG6CTT CATOCCCATC 1380 

CCTGGAGGCA ATGGCTTTTC TCCTQGOGTT GGG6GA6C0G OTGTOGflS Q C OGGGGQACAG 1440 

GGACCTATCA TCACTGdACT AACAATTCTQ AACCAOACAA TAOATATCTQ TAAOCATCAT 1500 

GCTAACCTTT GTTTAAATGG ACGCTGTATA CCAACTGTCT CAAGCTACC6 ATGTGAATGC 1560 

AACATGGGTT ATAAGCAGGA TGCAAATGGA GATTGTATAG ATGTTGATGA ATGCACATCA 1620 

AATCCCTGCA CTAATGGAGA TTOTGTTAAC ACACCTGQTT 0CTATTATT6 TAAATGTCAT 1680 

GCTG6ATTCC A6AGGACT0C TACCAAOCAA GGATGCATTG ATATTGATGA GTGCATCCAG 1740 

AATGGGGTTC TTTGTAAAAA CGGT0QA1GC GTGAACTCA6 ATGGAAGTTT CCAGTGCATT 1800 

TGCAATGCCO GCTTTQAATT AACTACAQAT GGAAAAAACT GTGTTGATCA TGATGAATGT 1860 

ACAACTACCA ACATGTGTTT GAATGQAATG TGCATCAATG AAGATGGCAG OTTCAAGTGC 1920 

ATCTQCAAAC CAGOATTTGT CTTGGCTCCA AATGGGCGTTT ACTGTACTGA TGTTGATGAA 1980 

TGCCAOACCC CAGGAATCTG CATOAATGGO CACTGCATCSi ACAOTQAAGO OTCCTTCOGC 2040 

TGTGACTGTC CCCCAG6CCT GGCTGTGGGC ATG6ATGGAC GTGTGTGTGT TGATACTCAC 2100 

ATGOGCAGTA CCTGCTATGG AGGAATCAAG AAAGQAGTGT OTGTQCOTCC TTTCCCOSGT 2160 

GCRGTGACCA AGTCCGAATG CTGCTGTGCC AATCCAGACT ATGGTTTTGG AGAACCCTGC 2220 

CAGCCATGCC CTGCAAAAAA TTCAGCTGAA TTCCAOQGOC TTTGTAGTAG TGGAGTAGGT 2280 

ATCACTGTG6 ATGGAAQAGA TATCAATGAA TGTOCTTTGG ATCCTQATAT ATGIOCCAAT 2340 

GG6ATTTGT6 AAAACTTACX3 TG6TAGTTAC OGTTOTAATT GCAACAGTGG CTAT6AACCA 2400 

GATGCCTCTG GAAOAAACTG TATTGACATT GATGAATGTT TAGTAAACAG ACTGCTTTGT 2460 

GATAACGGAT TGTGCCGAAA CACX3CCAGGA AaTTACAGCT GTAOGTGCCC ACCAGGGTAT 2520 

GTQTTCAGGA CTGAQACAGA GACCTGT6AA GATATAAATG AATGTGAAAG CA ACCC ATGT 2580 

GTCAAT6G06 CCTGCAGAAA CAACCrTGGA TCTTTCAATT GT6AATGTTC GCCOGGCAGC 2640 

AAACTCAOCT CCACA6GATT OATCTOTATT GACAGCCTQA A0GGGACCT6 TTGGCTCAAC 2700 

ATCCA6GACA GCCGCTGTGA GQTGAATATT AATGGAGCCA CTCTGAAATC TGAATGCT6T 2760 

GCCACCCTCG GAGCCGCCTG GGGGAGCCCC TGTGAGCGGT GTGAACTAGA TACAGCTTGC 2820 

CXIAAGAGGGC TTGCCAGGAT TAAAGGT6TT ACX3TGTGAAG ATGTTAATGA GTGTGAGGTG 2880 

TTCCCTGOOO TTTGICCAAA TGQAOSCTGT GTCAACAGTA AGGGATCrTT TCATTGC6AG 2940 

TGC0CT6AAG GC3CTTACGTT 6GATGGGACT GGCOQTGTAT OTTTGGATAT TCGCATGGAG 3000 

CAGTGTTACT TGAAGTGGGA TGAAGATOAA TGCATCCACC CCGrrTCCTGG AAAGTTCCGC 3060 

ATGGATGCCT GCTGCTGTGC TQTCX3QGGCG GCTTGGGGCA CCGAGTGTGA GGAGTGCCCC 3120 

AAACCTGGCA CCAAGGAATA OQAGACACTG TGCCCCCGOG GGGCTGGCTT TGCTAACCGA 3180 

6GGGATGTTC TTACTGQGOG 6CCATTTTAC AAAGACATCA ATGAATGCAA AGCATTTCCT 3240 

GGGATGTGCA CTTATGGOAA GTGCAGAAAT ACAATCX5GAA GCTTCAAATG CCGTTGCAAT 3300 

AGTGOCTTTO CTCTAGACAT GGAGGAAAGA AACTGCACGG ACATCGACGA GTGCAGGATT 3360 

TCTCCTGACC TCTGTGGCAG TGGAATCTGC GTCAATACAC CGGGCAGCTT TGAGTGCGAG 3420 

TGCTTOOAAa GCTATOAAAG TGGCTTCATG ATGATGAAGA ACTGCATGGA CATTGA06GA 3480 

TGTGAACGTA ACCCTCTCCT TTGTAGGGGT GGCACCT01X5 TQAACACTGA GGGCAGCTTT 3540 

CAGT6TGACT GCCCACTG66 ACACGAGCTG TCACCATCCC GTGAGQACTG TGTGQATATT 3600 

AATGAATGCT CCCTGAGTGA CAATCTCTGC AQAAATGGAA AATGTGTGAA CATGATTGGA 3660 

ACCTATCAGT GCTCTTGCAA TCCTGGATAT CAGGCTACGC CAGACCGCCA GQGCIOTACA 3720 

GATATTGATG AATGTATGAT AATOAACGGA GGCTGTOACA CCCBM3TGCAC AAATTCA6AG 3780 

GGAAGCTACG AAT6CAGCTG CAGTGAGGGT TATGCCCTGA TGCCAQATGG GAGATCGTGT 3840 

GCAGACATTG ATGAATGTGA AAACAATCCT GATATCTGTG ATGGOGGCCA GTGTACCAAC 3900 

ATTCCTGGAG AGTATCGCTG CCTCTGCTAT GATGGCTTCA TX3GCTTCCAT GGACATGAAA 3960 

ACATGCATT6 ATGTCAATGA ATGTGACCIA AATTCAAATA TCTGCAIGTT TGGOGAATGT 4020 

QAGAACACAA AGGGATCCTT CATTTGCCAC T6TCA6CTG6 GTTACTCAGT GAAGAAGGGG 4080 

ACCaCAGGAT QTACAGATGT GGATGAGTGT GAAATTGGTG CTCATAACTG CGACATGCAT 4140 

GCCTCATOTC TGAATATCCC AGGAAGCTTC AAGTGTAGCT QCAGAGAAGG CTGGATTGGA 4200 

AACGGCATCA AGTQTATTGA TCTGGACGAA T6TTCTAATG QAACCCACCA GTGTAGCATC 4260 

AATGCTCAGT GTCTAAATAC C006GGCTCA TACCGCTGTG CCTGCTCCGA AOGTrTCACT 4320 

GGTGATQGCT TTA0CT6CTC AGAIGTTGAT GAaTGTSCAG AAAACATAAA GCTCTGTGAjQ 4380 
AACX3GACAGT GCCTTAATGT CCCX3GGTGCA TATCGCTGCG AGTGTGAGAT 6G6CTTCACT 4440
CCAGCCTCAG ACAGCAQATC CTGCCAAGAT ATTGATGAAT 6CTCCTT0CA AAACATTTGT 4500
GTCTCrrGGAA CATQTAATAA CCTGCCTGGA ATGTTTCATT GCATCTGCXJA TGATGGTTAT 4560
GAftTTGGACA GftACftGGftTO GAACTGTACA GATATTGATG AQTGTGCAGA TCCTATAAAC 4620
TGfTOTCAATO QCCTATQTGT CAACACGCCT GGTC6CTATG AGTGTAACTG CCCACCCGAT 4680
TTTCAGTTGA ACCCAACTGG TGTGGGTTGT QTTGACAACC GTGTGGGOUl CTGCTACCTG 4740
AAGTTTGGAC CTCX3AGGAGA TGGGAGTCTG TCTTGCAACA CCGAGATOGO G6TGG6CGTC 4800
AGTCGCTCTT CATGCTGCTG CTCTCTGGGA AAGGCCTGGQ GAAACCCCTG T6AGACATGC 4860
CCCCCTGTCA ATAGCACTQA ATATTACACC CTGTGTCCCG GAGGT6AAGG CTTCAGACCT 4920
AACCCCATCA C3WWJCATTTT AGAAGACATT GACX3AATGCC AGGAGTTACC A6GTCTCTGC 4980
CAGGGTGGAA ACTGCATCAA CACTTTTGGG AGCTTCCAer 6TGAGTGCCC ACAAGGCTAC 5040
TACCTCAGCQ AGGATACCCG CATCTGTGAG GATATTGATG AGTGTTTTOC ACATCCTGGT 5100
GTGTGTGGGC CTGGGACCTG CTATAACACC CTGGGAAATT ACACCTGCAT TT6CCCACCT 5160
GAGTACATGC AGGTCAATGG AGGCCACAAC TGCATGGACA TGAQAAAAAG CTTTTGCTAC 5220
CQAAGCTATA ATGGAACCAC TTGTGAGAAT GA6TTGCCTT TCAATGTGAC AAAAAGGATG 5280
TGCTGCTGCA CATATAATGT GGGCAAAGCT GGGAACAAAC CTTGTQAACX: ATGCCCAACT 5340
CCAGGAACAG CTGACTTTAA AACCATATGT 6GAAATATTC CTGGATTCAC CTTTGACATT 5400
CACACAGGAA AAGCTGTTGA CATT6ATGAA TGTAAA6AGA TTCCAGGCAT TTGTGCAAAT 5460
GGTOTOTOCA TTAACCAGAT TGGCAGTTTC CGCTGTGAAT GCCCTACAGG ATTCAGTTAC 5520
AATOACCTGC TGTTGGTTTG TGAAGATATA GAT6AGTQCA GCAATGGTGA TAATCTCTGC 5580
CAGCGGAATG CAGACTGCAT CAATAGTCCr GGTAGTTACC GCTGT6AATG T6CCGCGGGT 5640
TTCAAACTTT CACCCAATGO GQCCTGTGTA GATOSCAATG AATGTTTAQA AATTCCTAAC 5700
GTTT6CAGTC ATGGCTTQTG T8TTGATCT6 CAAG6AAGTT ACCAGTGCAT CT6CCACAAT 5760
GGCTTTAAGG CTTCTCAQGA CCAGACCATG TGCATGGATG TTGATGAGTG CGAGCGGCAC 5820
CCATGTGGAA ATGGAACTTG TAAAAACACC GTTGGATCCT ATAACTGTCT GTGCTACCCA 5880
GGGTTTGAAC TCACTCATAA TAATGATTGC CTGGACATA6 ATGAGTGCAG TTCCTTTTTT 5940
GGTCAGGTOT GCAOAAATGG ACGTTGTTTT AATGAAATTG GTTCTTTCAA GTGTCTATGT 6000
AACGAAG6TT AT6AACTTAC COCAGATGGC AAAAACT6TA TAGACACTAA TGAGTGT6TC 6060
GCCCTTCCOG GCTCTTGCTC TCCTGGTACC TGTCAGAATT TGGA6GGATC CTTCAGATGC 6120
ATCTGTCCCC CAGGGTATGA AGTAAAAAGC GAGAACTGCA TTGATATAAA TGAATGTGAT 6180
GAAGATOCCA ACATTTGTCT TTTTGGTTCC TGTACTAATA CTCCAGGGGG CTTCCAGTGC 6240
CTCTGCCCCC CTGGCTTTGT ACTATCIGAT AATOGAGGGA QATGCTTTGA TACTCGCCAG 6300
AGCTTCT 6 CT TCACAAATTT T6AAAATGGA AAGTGTTCTG TACCCAAAGC TTTCAACACC 6360
ACAAAAGCRA AATGCTGCTG TA6TAAGAT6 CCAGGAGAGG GCTGGGOGGA CCXXrrGTGAG 6420
CTGTGCCCCA AAGACGATGA AGTTGCATTT CAGQATTTQT GTCCATATGG CCATGGAACT 6480
GTCCCTA(?rC TTCATGATAC AOOTGAAQAT GTCAAT6AGT GTCTTGAGAG CCCAGGCATT 6540
TGTTC3VAATG GTCAATGTAT CAACA0CX3AC GGATCTTTTC GCTGTGAATG TCX3VATGGGC 6600
TACAACCTTG ACTACACTGG AQTACXXnfGT GTGGATACTG ATGAGTGTTC AATCGGCAAT 6660
CCGTGTGGAA ATGGTACATG CACCAATGTT ATTGGGAGTT TT6AATGCAA TTGCAATGAA 6720
GGCTTTGAGC CAGGGCCCAT GATGAATTGT GAAQATATCA ACGAATOTGG CCAGAACCCA 6780
CTGCTGTQTG CTTTACGCTG CAT6AACACT TTTGGGTCCT AT6AATGCAC GTGCCGQATT 6840
OGCTATGCCC TCAfiGQAAGA TCAAAAGATG TGCAAAQATC TGGA7GAAT6 TGCTGAAGGG 6900
TTACACGACT QTGAATCTAG GGGCATGATG TGTAAGAATC TAATCGGCAC CTTCATGTGC 6960
ATCT6CCCTC CTGGAATGGC CCGAAGGCCC GATGGAGAAO GCTCTGTA6A TGAAAATGAA 7020
TGCAGGACCA AGCCAGGAAT CTGTGAAAAT GGACQTTGTQ TTAACATTAT TGGAA6CTAT 7080
AOATGTGAGT GTAATGAAGG ATTCCAOTCA AGTTCTTCAQ GCACTGAATG CCTTGACAAT 7140
CQACMSOGTC TCTGCXTTGC AGAGGTACTO CAGACAATAT GTCAAATGGC ATCCAGTAGT 7200
CGCAATCrCG TCACTAAGTC AGAATGCTGC TGTGATGGTG GGCGAGGCTG GGGCCACCAG 7260
TGOGAGCTTT GCCCACTTCC TGGAACT6CC CAGTACAAAA AGATATGTCC TCATGGCCCA 7320
GGATATACAA CTGATGGAAQ AGATATTGAT GAATGTAAGO TAATQCCAAA CCTCTGCAOC 7380
AATGGTCA6T GCATCAATAC CATGGGCTCA TTC06ATGCT TCTGCAAOGT TGGCTACAGC 7440
ACAGACATCA GTGGAACCTC TTGTATAGAC CTTGATGAAT GCTCCCAGTC CC06AAACCA 7500
TOCAACTACA TCT6CAAGAA CACT6AGGGG AGTTATCAQT GTTCaVTQTCC GAGGGGGTAT 7560
GTCCTGCAAG AGGATGGAAA GACATGCAAA GACCTTGATG AATGTCAAAC AAAGCAGCAT 7620
AACTGCCAGT TCCTCTGTGT CAACACCCTG GQGGGGTTTA GCTGTAAATG TCCACCTGGT 7680
TTCACACAGC ATCACACTGC TTGTATOGAC AACSMCGAAT GTGGGTCTCA ACCTTTGCTT 7740
TGTGGAGGAA AGGGAATCTG TCAAAACACT CCAGGCAGTT TCAGCTGTGA ATGCCAAAGA 7800
GGGTTCTCTC TTGATGCCAC CGGACTGAAC TGTGAAGATG TTGATGAATG T6ATGGGAAC 7860
CACAGGTGCC AACACGGCTG CCAGAACATC CTGQGTGGCT ACAGATGTGG CTGCCXX:CAA 7920
GGCTACATCC AGCACTACCA GT6GAATCA0 TGTGTOQATG AGAAT6AAT6 CTCC3VATCCC 7980
AATGCCTGTG GCTCTGCTTC CTGCTACAAC ACCCTG6GGA gttAcaagtg C6CCTGCCCC 8040
TCGGGGTTCr CCTTOGACCA GTTCTCCAGT GCCTGCCAOG acgtgaatga 6TGCT06TCC 8100
TCCAAGAACC CCTGCAATTA CGGCTGCTCT AACAOGGAGG GGGGCTACCr CTGTGGCTOC 8160
CCCCCTGGGT ATTACAGAGT GGGACAAGGC CACTGTGTCT CAGGAATGGG ATTTAACAAG 8220
GGGCA6TACC TSTCACTGGA TACAGAGGTC GATGAG6AAA ATGCTCTGTC CCCAGAAGCA 8280
TGCTACGAGT GCAAAATCAA CGGCTATCCT AAGAAA6ACA 6CAGGCAGAA GAOAAGTATT 8340
CATGAAGCTG ATCCCACTGC TGTTGAACAG ATCAGCCTAG AGAGTGTCX3A CATGGACAGC 8400
CCCGTCAACA TGAAGTTCAA CCTCTCCCAC CTCX3GCTCTA AGGAGCACAT CCTGGAACTA 8460
AGGCCCGOCA TCCAGCCCCT CAACAACCAC ATCCGTTATG TCATCTCTCA AGGGAACGAT 8520
QACAIOGSTCT TCOSCATCCA CCAAAGGAAT GGGCTCAQCT ACTTGCACAC GGCCAAGAAG 8580
AAGCTCAT6C CCGGCACATA CACACTG6AA ATCACTAGCA TCCCTCTCTA CAAGAAGAAG 8640
GAGCTTAAGA AACTGGAAGA GAGCAAT6AG 6AT6ACTACC TCCTAGGGGA 6CTTGG66AG 8700
GCTCTCAGAA TGAGGCTGCA GATTCAfiCTC TATTAACGGT TCACAGACTT GGGCCCA6QC 8760
TCAAATCCTA GCACAGCCAG TCTGCAGAAG CATTTGAAAA GTCAAGOACT AATTTTAAAO 8820
AQGAAAAATA ATAATAACTC TTGTTTCTTT CCTCCCTGTC TTAGACTTTG AATGTTGACC 8880
CTCACA6G6A GGGATAATTT AGACTCTGGT ATGGCCAAAG ATTTGAQCTC AAAGGCAACC 8940
GTGGTTACTG TATTTTTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA ACCTAAATGT 9000
TCAAGATATC AGCATATGGC ACTAAATGCA CAAAAATAAT QTQAGCTTTT TTTTTTTTTT 9060
CCTGTTAGCA GTCTGTAACA CTTTGGGTAT TTTGCTATAG TTGCTAATTA AAAAAATATA 9120
GATGTTTATT TATTTTTAAT GCAGTAATAT ATGGAGAAAT GAACAAACTA TGTAAACAAA 9180
AAGGGAAACT CACTTGTTTT TCTTTA6ATT TATAAATTTG AGCTATTTTT TTTAGAGGTG 9240
CTTTTTAAAA ATCCAATAGA TACAA6AGAT 6TTTCCTTTG GTTTTCTGCC AGTCATCCAG 9300
CTGATACACA CCTGATCGAT TTTAAAGAAA GCCACACAGA QCTGAATGGG QCAGTGCTAA 9360
TCAATAATTT AAAAGACAT6 AATGTCATTA 6ATCCTTOAT AACX3TAGATC GAAGCCAAAG 9420
CAGCTCATTT GTGACAACAT TTCATATCAC CAGACACACC AGGCAACAGA AGTTGAA6CA 9480
CAACCACTGT AGCAAAATAC CTTGACTGCT TGTGAGACCA TTAGCATTGC AGGCCAAACC 9540
GTACTGTATT TCCTTCTCAT AACCTCAAGG AACCATATGT GCTACCCACA ACACCTCATT 9600
CTTACCCAGG 6TGC6CTG0G TCCTCATG6T ACTGTA6GCA GCTGAAGAAC CGCGGTTCCX: 9660
TTGAAAG60A ACACCT6GCA TTCTGTGGT6 TTT0GT6CTG TCTTAAATAA TGGTGCATTT 9720 
ATTATGTTCA AGTTATTTCa GGATTGCCAT ATGTGCAAAC AAATCATGCA ATGCAfiCXaA 9780 
GGAATATATG TTGTTGTTGT TGTTTTAAAC CCATTTTTTT TTTAGAATTT TCATTAATAC 9840 
T8TA6TTATA CACCATATGC CTC ATTT TAT CATASOCTAT 1GTGTATGAA AGATGTTTGT 9900 
ACAATGAATT GATGTTTAGT TTGCTTTAGT CATTTAAAAA OATATTOTAC GAGGATSTGC 9960 
TATTAAGAGC ACGTATCCAT TATTCTTCTC AACCCAAGAA CCTGTTTCCT GOACCAGTOA 10020 
CCAAACCrCA TATGTGAAAT GGCCAAAGCA CATGCAOQCT CCTGGTTGTT CCTCTCAAAC 10080 
CTGTGCTGAC CSiAAGATTAG TAACXaOTTA TACCCAOTAT TTTGAGGTTT TATTGTTTTT 10140 
TTAATAACTA AAAAAAAACT OGTQCC 



Seq ID NO: 459 Protein sequence
Protein Accession #: NP_001990.1



1 11 21 31 41 51

| | | | | |

MGRRRRIjCLQ LYFIiWLGCW LWAQC5TAGQP QPPPPKPPRP QPPPQQVRSA TAGSEGGFLA 60 

PEYREEGAAV ASRVRRRGQQ DVLRGPNVOS SRFHSYCCPG WKTLPGfflfQC IVPIC31KSC6 120 

DGPCSRPMMC TCSSGQZSST OGSKSIQQCS VRCMKGGTCA 0DHCQCX2KGY IGTyCGQPVC 180 

BNQOQNGGRC lAQPCACVYG FTGPQCERDY RTGPCPTQVN NQMCQ6QLTG IVCTKTLCCA 240 

TTGRAWGHPC EMCPAQPQPC WIGPIPNIRT GACQDVDBCQ AIPGICQQCaT CINTVGSFEC 300 

RCPAGHKOSE TTQKCEDIDB CSIIPQICET QECSNTVGSY PCVCPRGYVT STDGSRCIDQ 360 

RTGMCFSGLV NSRCAQELPG RMTKMQCCCE PORCWGIOTI PEACPVRGSE BYRRLCMDGL 420 

PMGGIP65AG SRPQGTGGNQ FAPS6NGN0Y GPGGTGFIPI PGGNGFSPGV GGAGVGAGGQ 480 

GPIITGLTXL NQTIDICKHH ANLCLNGRCI PTVSSYRCEC NMGYKQDANG DCIDVDBCTS 540 

NPCTNGDCVN TPGSYYCKCH AGFQRTPTKQ ACIDIDECIQ NGVLCKNGRC VNSDGSFQCI 600 

CNAGPELTTD GKNCVDHDBC TTTNMCLMGM CINEDGSPKC ICKPGPVI»AP NGRYCTDVDB 660 

CQTPGICMNG HCINSEGSFR CDCPPGLAVG HZXSCVCVDTH MRSTCYGGXK KSVCVRPFPG 720

AVTKSECCCA NPDYGPGBPC QPCPAWISAE PHGUSSOVG ITVDORDINB C3aJ)PDrCAN 780 

GICENLRGSY Ra7C3fSGYEP DA5GRNCIDI DECLVNI^LC 0N6LCR11TPG 5YSCTCPP6Y 840 

VFRTBTETCB DXNECBSNPC VNGACRNNLG SFNCBCSPGS KLSSTGLICI DSLKGTCWU} 900 

IQDSRCEVNI NGATLKSBCC ATIiGAAHGSP CERCBLDTAC PRGLARIKGV TCEDVNECEV 960 

FPGVCPNGRC VNSXGSFHCB CPBGLTLDOT 6RVCLDIRNB QCYLKNDEDB CXHFVFGXFR 1020 

MDACCCAVGA AHGIECEECP KPGTKEYETl. CPRGAGFAMR GDVLTGRPFY KDIHECKAFP 1080 

©4CTYGKCRN TIGSFKCROJ SGPALDMBBR NCTDIDECRl SPDLOGSGIC VNTP6SPSCB 1140 

CFEGYESGFM MHKKC34DIDG CERNFLLCR6 GTCVNTE6SP QCDCPLQiEL SPSREDCVDI 1200 

liECSIiSDNLC 2{NGKCV]!IMIG TYQCSC3)PGy QATPDRQGCT DIDECMIMNG GCSTQCmSE 1260 

GSYECSCSEG YAIMPDSRSC ADIZSECBHiaP DICDQGQCTN IPGBYRCliCY DGPNASMDMR 1320 

TCIOVNECDL NSNICMFGEC ENTKQSFZCH OQLGYSVKKG TTGCTDVDEC BIGAHNCDMH 1380 

ASCUTIPGSF KCSCRBGWIO NOIKCIOIAS CSHGTHQCSI NAQCVNTPGS YRCACSBGFT 1440 

GDGFTCSDVD KCAE^7UILlCE KGQCLNVPGA YRCECEMGFT PA8DSRSCQD I0ECSFQ2IIC 1500

VSGTCMNLFG MFHCICDDGY ELDRTGONCT DZDBCADPIN CVN6LCVNTP GRYEQfCPPD 1560 

FQLNPTSVGC VDMSVOfCYL RF6FRGDG8L SCMTSIGVSV SRS8CCCSL0 KAHGNPCETC 1620

PFVNSTEYYT IiCPGGBGFRP NPZTIZIiBDZ DBOQBLFGIiC QGGtiCZNTFG SFQCECPQGY 1680 

YLSEDTRICB DIDBCPAHPG VCGPGrCYNT LGNYTCICPP EYMQVNGGHN MMRKSFCY 1740 

RSYNGTTCEN BLPFNVTKRM CCCTYNVGKA GMKPCEPCPT PGTADPKTIC GN^ZPGPTFDZ 1800 

HZGKAVDIDE CKBIPGZCAN GVCINQI6SF RCECPTGFSY NDLLIiVCEDZ DBCSNGDNLC 1860 

QRNADCINSF GSYRGECAAO FKItSPNOACV DSNECLBZPN VCSHGLCVDL QGSYQCICHN 1920 

GFKASQDQTM CMDVDECERR PCGNGTCKNT VGSYNCLCYP GFELTHNNDC LDZOECSSFP 1980 

6QVCRNGRCF NEIGSFKCLC N03YELTPDG KNCZDTNECV ALPGSCSPGT CQNLEGSFRC 2040 

ZCPPGYBVKS ENCIDIKEC33 EDPNICLFGS CTNTPGGFQC LCPPGFVLSD NQRRCFDTRQ 2100 

SPCFTNFBMO KCSVPKAFNT TKAKCCCSKM PGBGWGDPCE U3»KDDBVAP QDLCPYGHGT 2160 

VPSLBDTRED VHBCLBSP6Z CSN6QCZNTD 6SFRGECPMG YHLDYTGVRC VDTDECSIGN 2220 

PCXaiGTCTMV IGSFECNCNE GFSPGPMKNC EDXNBCAQNP LLCALRCMNT F6SYBCTCPZ 2280 

GYALREDQKM CRDLDBCAEG LHDCBSRGMM CKNLIGTFHC ZCPPGHARRP DGEGCVCaTE 2340 

CRTKPGICEN GRCVNIIGSY RCECNEGPQS SSSGTECLDN RQGliCPAEVL QTZCQMASSS 2400 

RNLVTKSECC CDGGR6WGHQ CEI.CPLPGTA QYKKICPHGP GYTTOGRDID EOCVMPNLCr 2460 

NGQCZmMQS FRCFCKVQYT TDZSGTSCZD LDBC8QSPKP CNYICKISTEG SYQCSCPRGY 2520 

VLQEDGKTCK DUIBOQTKQH NCQFLCVNTL GGFTCKCPPO PTQBHTACID NNECGSQPLL 2580 

CGGKGZCQMT PGSFSCECQR GFSLDATGLN CEDVDECDGN HRCQHGOQNI L6GYRGGCPQ 2640 

GYIQHYQWNQ CVDENECSNP NACQSASCYN TLG9YKCACP SGPSFDQPSS ACHDVNECSS 2700 

SKNPCNYGCS NTB60YLOGC PPGYYRVGQG HCVSGMGFNK GQYIiSIiDTEV DSENALSPBA 2760 

CYECKINGYP KKDSRQKR8Z HEPDPTAVEQ ISIiBSVDMDS PVNNKFNLSK LGSKBHZLEL 2820 

RPAIQPIirami IRYVZSQGND DSVFRIHQRH GLSYIATAXK IOUMPGTYTLE ZTSZPIiYKKK 2880
ELKKLEESllB DDYLLGBLGE ALRMRLQIQL Y 



Seq ID NO: 460 DNA sequence

Nucleic Acid Accession #: NM_013372.1

Coding sequence: 63..617



1 11 21 31 41 51

| | | | | |

GCGGCCGCAC TCAGCGCCAC GCGTCJGAAAG CX3CAGGCCCC GAGGACCCGC CGCACTGACA 60 

GTATGAGCXX5 CACAGCCTAC AOGGTGGGftQ CCXTTGCTTCT CCTCTTGGGG ACCCTGCTGC 120 

C6GCTGCTGA AG6GAAAAA6 AAAG66TCCC AAGGT6CCAT CCCCC06CCA GACAAGGCXX 180 

AOCACAATGA CTCAGASCAG ACTCAGT06C COCAGCAGCC TGGCICCAGG AAC0GGG6GC 240 

GGG6CCM60 G0QGG6CACT GCCATGCCOG GGGAGGAGGT GCT66AGTCC AGOCAAGAGG 300 

CCCTGCAT6T QAOSGAGCGC AAATACCTGA AGOQAGACTG GTGCAAAACC CAGCCGCTTA 360 

AGCAGACCAT CCACGAGGAA GGCTGCAACA GTCGCACCAT CATCAACCGC TTCTGTTACG 420 

GCCAGTGCAA CTCTTTCTAC ATCCCCAGGC ACATCGGGAA GGAG6AAGGT TCCTTTCAGT 480 

CCTGCTCCTT CTGCAAGCCC AAGAAATTCA CTACCATGAT 6GTCACACTC AACTGGCCTG 540 

AACTACAGCC ACCTACCAAG AAGAA(S\GAO TCACA06TGT GAKBCAOTGT GGTTGCATAT 600 

CCATCGATTT GGATTAAGCC AAATCCAGGT GCACCCAGCA TGTCCTAGGA AT6CAGCCCC 660 

AGGAAGTCCC AGACCTAAAA CAACCAGATT CTTACTTGGC TTAAACCTAG AGGCCAGAAG 720 

AACCCCCAGC TGCCTCCTGG CAGGA6CCT6 CTTGTGCGTA GTTCGTGTGC ATGAGTGTGG 780 

ATGOGTGCCr 6TGGGTGTTT TCASACACCA 6AGAAAACAC AGTCTCTOCT AOAGAGCACT 840 

GCCTATTTTG TAAACATATC TGCTTTAATG 6GQATGTACC AGAAAGCCAC CTCACCCCGG 900 
CTCACATCTA AAGGGGCGGG GCCGTGGTCT GGTTCTdACT TTGTOTTTTT GTGCCCTCCT 960

GGGGACCAGA ATCTCCTTTC GQAATGAATO TPCATOQAAO AGGCTCCTCT GAGGOCAAGA 1020 

GACCTGTTTT AQTGCTGCAT TCGACATGOA AAAGTCCTTT TAACCTGTGC TTGCATCCTC 1080 

CTTTCCTCCT CCTCCTCACA ATCCATCTCT TCTTAAGTTG ATAGTGACTA TGTCA6TCTA 1140 

ATCTCTTGTT TGCCAAGGTT CCTAAATTAA TTCACTTAAC CATGATGCAA ATGTTTTTCA 1200 

TTTTGTC3AAG ACCCTCCAGA CTCTGGGAGA GGCTGGTGT6 GGCAAGGACA AGCAGGATAG 1260 

TGGAGTGAGA AAGGGAGG6T GGAGGGTGAO GCCAAATCAG GTCCAGCAAA AOTCAGTAGG 1320 

GACATTGCAG AAGCTTGAAA GGCCAATACC A6AACAC3U3G CTGATOCTTC T6A6AAAQTC 1380 

TTTTCCTAGT ATTTAACAGA ACCCAAGTGA ACAGAGGAGA AATGAGATTG CCAGAAAGTG 1440 

ATTAACTTTG GCCX3TTGCAA TCTGCTCAAA CCTAACACCA AACTGAAAAC ATAAATACTG 1500 

ACCACTCCTA TGTTCGGACC CAAGCAAGTT A6CTAAACCA AACCAACTCC TCTQCTTTGT 1560 

CCCTCAGGTG GAAAASAGAG GXASTTTAGA ACTCTCTOCA TAGQQGTGGa AATTAATCAA 1620 

AAACCKOUSA G6CTQAAATT CCTAATACCT TTCCTTTATC GTG3TTATA0 TCAGCTCATT 1680 

TCCATTCCAC TATTTCCCAT AATGCTTCTG AGA60CACTA ACTTGATTGA TAAAGATCCT 1740 

GCCTCTGCTG AGTGTACCTG ACAGTAAGTC TAAAGATGAR AGAGTTTAGG GACTACTCTG 1800 

TTTTAGCAAG ARATATTKT6 GGGOTCTTTT TGTTTTAACT ATT6TCA6GA GATTQGGCTA 1860 

RAGAGAAGAC GA0GAGA6TA AGGAAATAAA GGGRATTGCC TCTG GCTA GA GAGTAAGTTA 1920 

GGTGTTAATA CCTOGTAQAA ATGTAAGGGA TATGACCTCC CTTTCTTTAT GTGCTCACTG 1980 

AGGATCTGAG GGGACCCTGT TAQGAGAGCA TAGCATCATG ATGTATTAGC TGTTCATCTG 2040 

CTACTGQTTG GATQGACATA ACTATTGTAA CTATTCAGTA TTTACTGGTA GGCACTOTCC 2100 

TCTGATTAAA CTTGGCCTAC TGGCAATGGC TACTTAGGAT TGATCTAAGQ GCCAAAGTGC 2160 

AGOGTGGGTG AACTTTATTG TACTTTGGAT TTGGTTAACC TGTTTTCTTC AAGCCTGAGG 2220 

TTTTATATAC AAACTCCCTG AATACTCTTT TTGCCTTGTA TCTTCTCAGC CTCCTAGCCA 2280 

AGTCCTATCT AATATGGAAA ACAAACACTG CA6ACTTGAG ATTCAGTTGC CGATCAAGGC 2340 

TCTGGCATTC AGAGAACCCT TGCAACTCGA GAAGCTGTTT TTATTTCGTT TTTGTTTIGA 2400 

TCCAGTGCTC TCCCATCTAA CAACTAAACA GGAGCX3\TTT CAAQGCGGGA GATATTTTAA 2460

ACACCCAAAA TOTTGGGTCT GATTTTCAAA CTTTTAAACT CACTACTQAT GATTCTCACG 2520 

CTAGQC3GAAT TTGTCCAAAC ACATAGTGTG TGT Q TT T TG T ATACACTGTA T6ACCCCACC 2580 

CCAAATCTTT GTATTGTCCA CATTCTCCAA CAATAAAGCA CAGAGTGGAT TTAATTAAGC 2640 

ACACAAATGC TAAGGCAGAA TTTTGAGGGT GGGAOAGAAG AAAAGGGAAA GAAGCTGAAA 2700 

ATCTAAAACC ACACCAGGGA GGAAAAATGA CATTCAGAAC CAGCAAACAC TGAATTTCTC 2760 

TTGTTOTTTT AACTCTGCCA CAAGAATGCA ATTTCGTTAA TGGAGATGAC TTAAGTTGGC 2820 

AGCaGTAATC TTCTTTTAGQ AGCTTQTACC ACAGTCTTGC ACATAAGTGC AGATTTGGCT 2880 

CAAGTAAAGA GAATTTCCTC AACACTAACT TCACTGGGAT AATCAGCAGC GTAACTACCC 2940 

TAAAAGCATA TCACTAGCCA AAGAGGGAAA TATCTGTTCT TCTTACTQTO GCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGAA AATGCCATAT CTATACCATA 3060 

TTTTATTOQA GTCACTGATG ATGTAATGAT ATATTTTTTC ATTATTATAG TAGAATATTT 3120 

TTATGQCaU^G ATATTTGTGG TCTTGATCAT ACCTATTAAA ATAATGCCAA ACACCAAATA 3180 

TGAATTTTAT GATGTACACT TTGTGCTTGG CATTAAAAOA AAAAAACACA CATCCTGGAA 3240 

GTCTGTAAGT TGTTTTTTGT TACTGTAGGT CTTCAAAOTT AAGAOTGTAA GTGAAAAATC 3300 

TGGAGGAGAG GATAATTTCC ACTGTGTGGA ATGTGAATAG TTAAATGAAA AGTTATGGTT 3360 

ATTTAATGTA ATTATTACTT CAAATCCTTT GGTCACTGTG ATTTCAAGCA TGTTTTCTTT 3420 

TTCTCCTTTA TATGACTTTC TCTOAGTTGG GCAAAGAAGA AGCTGACACA CCGTATGTTG 3480 

TTAGAGTCTT TTATCTQGTC AGGGGAAACA AAATCTTGAC CCAGCTGAAC ATGTCTTCCT 3540 

GAGTCAGTGC CTGAATCTTT ATTTTTTAAA TTGAATOTTC CTTAAAGQTT AACATTTCTA 3600 

AAGCAATATT AAGAAAGACT TTAAATGTTA TTTTGGAAGA CTTACGATOC ATGTATACAA 3660 

ACGAATAGCA GATAATGATG ACTAGTTCAC ACATAAAGTC CTTTTAAGGA QAAAATCTAA 3720 

AATGAAAAGT GGATAAACAG AACATTTATA AGTGATCAGT TAATGCCTAA GAGTGAAAGT 3780 

AGTTCTATTG ACATTCCTCA AGATATTTAA TATCAACTGC ATTATGTATT ATGTCTGCTT 3840 

AAATCATTTA AAAA06GCAA AGAATTATAT AGACTATGAG GTAC CTTGC T GTGTAGGAGG 3900 

ATGAAAGGGG AGTTGATAGT CTCATAAAAC TAATTTOaCT TCAAGTTTCA TGAATCTGTA 3960 

ACTAGAATTT AATTTTCACC CCAATAATOT TCTATATAGC CTTTGCTAAA GAGCftACTAA 4020 
TAAATTAAAC CTATTCTTTC AAAAAAAAA 

Seq ID NO: 461 Protein sequence 
Protein Accession #: NP_037504.1 

1 11 21 31 41 51 

ilSRTAYTVQA UiLLLGTLLP AABGKiaCQSQ GAIPPPDKAQ HNDSEQTQSP QQFGSRNRGR 60 

GQGRGTAMPG EEVLESSQEA LHVTBRKyiiK SDWCKTQPItR QTIH EEGC WS RTXINRFCVG 120 

QCHSEYIPRH IRKEEGSFQS CSPCKPKKPT TOMVTUICPE LQPPTKKKRV TRVKQCRCIS 180 
IDLD 

Seq ID NO: 462 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..2733 

1 11 21 31 41 51 

ATQAAAGTTG GAOTGCTGTQ GCTCATTTCT TTCTTCACCT TCACTGACGG CCACX3GTGGC 60 

TTCCTGGGQA AAAATGATGG CATCAAAACA AAAAAAGAAC TCATT6TGAA TAAGAAAAAA 120 

CATCTAGGCC CAGTCX3AAGA ATATCAGCTG CTGCTTCAGO TGACCTATAG A»,TTCCAAG 180 

GAGAAAAGAG ATTTGAQAAA TTTTCTGAAG CTCTTGAA6C CTCCATTATT ATGGTCACAT 240 

GGGCTAATTA GAATTATCAG AGCAAAGGCT ACCACAGACT GCAACAQCCT GAATGGAGTC 300 

CTGCAGTGTA CCTGTGAAGA CAGCTACACC TGGTTTCCTC CCTCATGCCT TGATCCCCAG 360 

AACTGCTACC TTCACACGGC TGGAGCACTC CCAAOCTGTG AATGTCATCT CAACAACCTC 420 

AGCCAGAGTG TCAATTTCTG TGAGAGAACA AAGATTTGGG GCACTTTCAA AATTAATGAA 480 

AGGTTTACAA ATGACCTTTT GAATTCATCT TCTGCTATAT ACTCCAAATA TGCAAATGGA 540 

ATT6AAATTC AACTTAAAAA AGCATATGAA AGAATTCAAG GTTTTGAGTC GGTTCAGGTC 600 

ACOCAATTTC GAAATGGAAG CATCGTTQCT GGGTATGAAG TTGTTGGCTC CAGCAGTGCA 660 

TCTGAACTGC TGTCAGCCAT TGAACATGTT GCCGAGAAGG CTAAGACAGC CCTTCACAA6 720 

CTGTTTCCAT TAQAAGAOSG CTCTTTCAGA GT6TTC3QSAA AAGCOCAGTG TAATGACATT 780 

GTCTTTGGAT TTGGGTCCAA GGATGATGAA TATACCCTGC CCTGCAGCSUS TG6CTACAGG 840 

GGAAACATCA CA6CCAAGTG TGAGTCCTCT GGGTGGCAGG TCATOWSGGA GACTTGTGTG 900 

CTCTCTCTGC TTGAAGAACT GAACAAGAAT TTCAGTATGA TTGTAGGCAA TGCCACTGAG 960 
6CAGCTGTGT CATCCTTCGT GCAAAATCTT TCTGTCATCA TTOGGCAAAA CXICATCAACC 1020 
ACAGTGGGGA ATCTGGCTTC GGTGGTGTOG ATTCTC^CA 
GCCAGCCATT TCAGGGTGTC CAATTCAACA ATGGAGQATG 
ATCCTTAATT CAGCCTCAGT AACCAACTGG ACAQTCTTAC 
AGCTCACGOT TACTAOAGAC ATTAGAAAAC ATCAGCACTC 
CCTCTQAATT TTTCTCGGAA ATTCATTGAC TGOAAAGGGA 
CTCAAAAOGG CrTTACAGCTA TCAGATTAAA ATGTGTCCCiC 
AGAGGCCX5TG TGTTAATTGG GTCAGACCAA TTCX3VGAGAT 
AGCATGGCCT CGTTGACTCT GGGGAACATT CTACCOGTTT 
GTCAATGGAC CTGTGATATC CAOGGTTATT CAAAACTATT 
TTTTTTTCCA AGATA6AGTC AAACCTGAGC CAGCCTCATT 
CATTTGCAGT G6AA06ATGC AGGCTGCCAC CTAGTOAATO 
TGCCAATGTA CTCACTTGAC CTCCTTCTCC ATATTQATGT 
ATCTTCCCCG TTGTAAAATG C3ATCACCTAT GTOGGACTGG 
ATTTTATGCC TGATCATCGA GGCTTTGTTT TGOAAGCAGA 
CACACAOSTC GTATTTGCAT GOTGAACATA 6CCCTGTCCC 
TTTATTGTT6 GT6CCACAGT GGACAOCAaS GTGAA CCCTT 
GTGTTCTTTA CACACTTCTT CTACCTCTCT TTGTTCTTCT 
CTGCTGGCTT ACCX3GATCAT CCTOGTGTTC CATCACATGG 
GTTGQATTTT GCCTGGOTTA TGGGTGCCCT CTCATTATAT 
A08CAACCTA GCAATACCTA CAAAAGGAAA GATGT CTQT T 
AOCAAACCSkC TCCTGGCTTT TQTTGTCCCT 6CACTGGCIA 
GTGGTGCT6C TAGTTCTCAC AAAGCTCTGG A6GCCGACT0 
GATGACAAGG CCACCATCAT CCGCGTGGGG AAGAGCCTCC 
GGGCTCACCr GGGGCTTTGG AATAGGAACA ATAGTGGACA 
GTTATTTTTG CTTTACTCAA TGCATTCCAO GGATTTTTTA 
TTGOACABTA AGCTOCQAC3V ACTT C T Q TTC AACAAGTTGT 
CAAACAGAAA AGCAAAACTC ATCAQATTTA TCTGCCAAAC 
AACCCACTGC AAAA(3UiAGG CCATTATGCA TTTTCTCATA 
ATCATGCTAA CTCAGTTTGT CTCAAATGAA TAA 

Seq ID NO: 463 Protein sequence 
Protein Accession #: Eos sequence 



MKVGVLSffLIS 
EKRDIiRNFLK 
NCYLHTAGAL 
IBIQLKKAYE 
LPPLEDGSFR 
LSLIiGELNKN 
ASHFRVSHST 
PLNFSRKFID 
SMASLTLQNI 
HIiQHNDAGCH 
ILCLIIEAIiF 
VFFTBFFYIiS 
TQPSNTYKUK 
DDXATIIRV6 
LDSKI1HQI1I.F 
IMLTQFVSNE 



11 
I 

FFTFTDGHGG 
LLKPPLLWSH 
PSCECUIiHNL 
RIQGPESVQV 
VFGKAQCNDI 
FSMIVGNATE 
MEDVISIADN 
HK6IPVNKSQ 
LFVSKHGMAQ 
LVNBTQDIVT 
WKQIKKSQTS 
LFFNMI1ML6I 
DVCHLNWSNG 
K6LLILTPLL 
NKIiSALSSWK 



21 
I 

FliGKMDGIKT 
GLIRIISAKA 
8QSVNFCERT 
TQPRNGSIVA 
VFGFGSKDDE 
AAVSSFVOmj 
ILNSASVTKH 
LKRGYSYQIK 
VNGPVISTVI 
CQCTHLTSFS 
HTRRICMVNI 
LIAYRIIIiVP 
8KPX1IAFWP 
GLTKGFGICrr 
QTEKQNSSDL 



31 
I 

KKBLIVNKKK 
TTDC2fSLNGV 
KIHQTFKINB 
GYEWGSSSA 
YTLPCSSGYR 
SVIISQNPST 
TVLLREEKYA 
NCFQHTSIPI 
QNYSIKBVFL 
ILMSPFVPST 
AL8LLIADVH 
HHHAQHLMMA 
ALAIVAVHFV 

vmaomMm 

SAKPKF5KPF 



ATATTTCATC 

TCATCAGTAT 
TGCGGGAAGA 
TG6TGCCTCC 
TTCCAGTGAA 
AAAATACATC 
CCXTTTCCAGA 
CCAAAAATGG 
CCATAAATGA 
OTOTGrTTTG 
AAACTCAAGA 
CACCTTTTGT 
GTATCTCCAT 
TTAAAAAAAG 
TCTTGATTGC 
CTGGAGTCTG 
GGAT6CTCAT 
CCCAGCATTT 
CTGTCATTAC 
G6CTTAACTG 
TTGTGGCTGT 
TTGGGGAAAG 
TCATTCTGAC 
QCCA6AATCT 
TCTTATGCTT 
CTGCCTTAAG 
CXAAATTCTC 
CTGGAGATTC 



41 
I 

HLGPVBEYQL 
LQCTCEDSYT 
RFTNDLLNSS 
SELLSAIBHV 
GNITAKCBSS 
TVGNLASWS 
SSRLIiBTZtEN 
RGRVIiIGSDQ 



Seq ID NO: 464 DNA sequence 

Nucleic Acid Accession #: AB035089.1 

Coding sequence: 9845..10219 



GGGCAT6CA0 
CAGTTCTAGT 
CCAAGAGGAA 
TTGGTTTGAA 
AOGAGGAAGG 
GAGACTATTT 
AAGAAAGGAA 
GTAGACAGAA 
AATCTCCTCC 
AATAAAATOT 
AA6GAAGT6A 
GTAATGACAG 
GGATTCCTGG 
CAAGTGTTCA 
TCAOGATATG 
TAGGAGAACT 
CAAGCTTTCT 
TTCAACCTTC 
CCATAGATTG 
CCrCCATTCC 
TGGTACCCGA 
C7GGATGCAG 
TTTCCTCTTC 
CTATTCCCCG 
ATATATATTG 
GTGAGAAATT 
CCTATGTGTT 
AAAACAAACT 
TGTGOAGAAA 



11 
I 

CCATC3GGGGA 
AAAAGGGAGA 
TTAGGGAGAG 
AGCATACAGT 
GA66CAAGA6 
COCTCTCTGC 
AGCTAGTTAO 
TCCTTGGGAA 
ACTAACCAGT 
TCTCTTGACT 
6GTCCAGGAA 
GATATTTCCT 
AGCCAATGAA 
TATGCAAAAA 
TCCAGTCTCA 
ATTTAGGAAC 
CTAAATTTAA 
AGGGCAAACC 
GTCCCCTGTA 
CAGGATGA6C 
GTAAATCCAT 
ACTCAGCT6A 
TTTCTTTTTC 
ACtTCAATCA 
ATACAGGAGA 
TCCAGATTGG 
TCXGGCACCT 
CA06GCTQGT 
TOUZAACTCT 



21 
I 

AAATCCATAG 
ACATCAATAT 
AGTTATAAGA 
AAATATGATQ 
ATAATATCAT 
TTTTCAAACC 
TCTTGTTCTO 
TACAGTAATT 
TTCCCTATAG 
TTGTTACTTA 
AATCTAGGAG 
GAAAGTGTAA 
GTTCGTGTAT 
CTTCTTGGAA 
CACACXZAGGA 
AGAAAAAAAT 
GCAAACTCCT 
TCCGTGCCTC 
ACCCOGGTTT 
TTGTTGCTTC 
CCTACTCCAA 
GAAOACCATT 
TTCTATCTTT 
OGGAACTTAT 
CCTAAGAAGA 
AAACACAGCT 
TGTTGTAGAT 
GTTAAAAAGO 
ATTCAOQGTC 



31 
I 

TGCAGATAAA 
AGGATGTTTC 
GATCAGCAAG 
TCTQTCOCTQ 
TTTCTCTGTG 
TTACTGQA6T 
AG6TTGTTCA 
GACATATATT 
A7TGCCACAA 
ACAATGCTGA 
ATATTTCTTA 
TTTCCCATTG 
GTTTATGAAA 
TTTCTGAQTT 
TATGTCCTTT 
GCCTG AAATG 
GGTCATTTTC 
AQACGTTTA6 
GTCTCAGCTT 
TGTCCTATGA 
TAGAGGAAGG 
ATTCATTTTT 
GGATTTTTAO 
AOCTCTTAAA 
GCATGTCTTQ 
TCCTTTCTCC 
AAATCTCCCT 
GCGCATGACA 
GGTTGGAATO 



IFPWKWITY 
PIVGATVDTT 
VGFCLGYGCP 
WLLVLTKLH 
VIFAUMAFQ 
NPLQNKGHYA 



41 

1 

GCAAOGAGGA 
TTAGCAATAG 
GGGACAGGGT 
OCAGTGTTGG 
CTCCAACTGT 
TGTTTTCCCT 
ATGTATACAT 
CTGTTATTTG 
GCACATAATA 
GAAAACTTTA 
ACCAATCTAT 
AGGATTTGTT 
TATCAAGAGA 
CTCTGTGGCA 
CTAGCCTGTC 
ATTTCTCATT 
AQTTAGTACC 
CC3VIAGTCT6 
6TTATCCTGT 
GACATTAGAT 
TCCATTTTTG 
GGAATTCTTT 
TCCATCAAC8 
CTCATTCAGA 
GQGGTTGAGG 
CATCCAGCCC 
TGACTTTGTG 
ATAGCAMSra 
CACACTTGIG 



TCTGTCACTO 
AGCTGACAAT 
AAAGTATGCC 
GACA6CTCTT 
CAAAAGCCAA 
TATTCCCATC 
AACTATTATC 
AAATGCTCAG 
AOTTTTCCTA 
GQATTTCW5T 
CAT06T6A06 
CCCCTCTACA 
TGGAAGTCTC 
CCAAACCTCT 
TGAT6TCTGG 
CACAGCTGCT 
GCTTGGCATC 
GATGATGGCT 
CATT6CTGTC 
GTCCAATGGA 
GAACTTCGTT 
ACTGAGTCX3G 
CCCTCTGCTA 
GGCTTGGCAT 
TGGAATACTC 
TTCTTGGAAG 
AAAGCCTTTC 
CTCOGACAAC 



51 

1 

LLQVTYRDSK 
HFPPSCLDFQ 
SAIYSKXAMG 
AEKAKTALHK 
GWQVIRBTCV 
ILSNISSLSL 
ISTIiVPPTAL 
FQRSLPBTII 
QPHCVFKDPS 
VGLGISIGSL 
VNPSGVCTAA 
LIISVZTIAV 
KBTVQESI18R 
GPFZLCFGIL 



TAGATTTGGT 
CAGAGTAGGA 
ACTTACATAT 
CATGAAAACC 
ATCTATATCT 

atgcttgaaa 
a6aaacaata 
cagccttcat 
aaaggcatta 

TTTAATTTCT 
CATAAGTTGG 
ATATAT6ACA 
TATCACATGC 
TGAACTCATC 
TTTCCTTAAG 
AAATTCTCTT 

TCCTTTTCTT 
TCTTATAGOQ 
ATCTCAGATA 
COCCATTAGT 
OACTCAAAAC 
AAACA6GCAG 
CTACTTTCAG 
ATGTGCIGAG 
TTGGGGAGAA 
CAGAATTCTA 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
60 
120 
180 
240 
300 
360 
420 
480 
540 
5 
10 
15 
TOGAGAAGAO 
TTTCATTCAT 
ATGAAAG6TC 
ATCAHCTTAT 
AAAAGGAAT6 
TTAAAGAAGC 
ATAGGCAAAT 
A6AA.TGAAGT 
GATGATAGCS^ 
TATATAAATT 
TGTCAGGGA6 
ATTCTTCAGA 
AAACAQAA6G 
CTTCATTTTT 
GCTCTTCTCA 
AAACATACCT 
GG66A66CCT 
AAGATGAGGA 
ATTAACTATC 
TGGGTCCCTC 
GG6AACACAG 
AACAGAAAAC 
CAAAGCTGAG 
ACTGAAGTAC 
TACAGAAAAA 
GAAAT6TAAG 
CftAGAAAAGA 
AGAATACAAA 
CATTGGCATT 
TTCTTATTTT 
TTTGTCATAA 
TQAlOTAACTT 
06ATCTCACC 
GGAACACAGG 
AATAATGATA 
TATTCTGCCA 
CTTAGTTCCA 
QAGAGTCTT6 
AGGTCTCCTG 
GAAAGAAAGA 
TAAGCCTTCT 
COTTAAAAAT 
TTTAAGAAAA 
TGATGCTTTT 
CAACACCAAG 
CTTCTATTCC 
CAACACTOCA 
T7TTTCTCTG 
AGCACAGGGG 
TTGAGTAAAG 
CACAGAGAAC 
CAGCTTTA6A 
AGGCCAGTTT 
TTCATGGCAC 
CCTCAOAGCA 
AAATGATGAA 
CTAAGGGTAA 
GCXOT3CCCA 
TAATCATAAT 
ATAAGCACAA 
CCTTAATAGT 
GTTCAAAACT 
TGAGCTTTCr 
TTAGCCATGG 
AGCCACS^T 
CCTGGCATAA 
AGAAGTGAAA 
AACCCTTTAA 
AAGAGAAAAG 
GGTGQATTTT 
GTCCTCAATG 
ATTTACATTQ 
AGGAAATGTT 
ATATQAGCTG 
AATTTCACCT 
AAATGGAAGA 
GTGCATTAGC 
ATGACTGAAG 
T6AGTAAGAG 
CGATACATCT 
GT6AGTTCCT 
CAAAAACAAA 
GAATCTCCTT 
CTTTTGCCAA 
TTTTGCATTG 
ATTCAGGGTC 
TACATAAATT 



TCTOGCATTT 
AGGTTTATAC 
ACAACATCAT 
GATTAAAGAA 
ATGTACTGAT 
CAGTCACAAA 
CCATAGAAAC 
ACAAGATTTC 
CAACTTTOTG 
ATATGTTAAT 
ATATTGGATT 
TTACAAGATA 
ACCATTGAGA 
AAATGTAAAA 
6TGTCAGCCT 
GAGACTGGCA 
CAGAATCACA 
AGAAGCAAAA 
ATGAGAATAG 
CAATAACATO 
CCAAACCATA 
CATCTGGGAT 
CACTCAGGA6 
ACTTCTTCAC 
TTACTAAGGA 
CTTTTTAGTT 
ATGGTGGGGT 
GGGATGGAGT 
AGAAAGGCAC 
TTCCCTATAG 
ATTTTCATAT 
CAAATACTAA 
CTTCATTCCA 
TAAGAGCTTC 
CTGTGTTCTA 
6AGCAAAATT 
AOTAAACCAA 
AA6GAGATGT 
TTAGCATTCA 
TAAAGA6GGT 
AATTAGGACA 
AAGAGAAAAA 
TT6TACTACA 
CCAGGAGTTC 
TTCATGTTCG 
CCTATCA6CA 
CAACAAATTA 
GTTCC6TCGG 
6CTGT6CAGG 
TTTTCTTGTC 
ACCACAGAAA 
TCCCTQAACA 
TAGGGAAAAT 
ATAATTATTA 
TA6GTCTGGA 
TAAGAGCAGG 
GAAAAAATCT 
TAAAAGTCAT 
AATGTGAAAA 
GTCCAAGTAT 
GCAAAAT6CT 
TGGCACTTCT 
TQTGTTCATC 
ACCTGGCAtA 
GAOTCAATOT 
TCTATTTAAA 
TATGGTCCTT 
ACCAACAAGG 
ATTGCAQTAC 
GAAGOaTTATT 
AGACTACCA6 
ATACAGCAAT 
CATCACCAGT 
AAOATGGCCA 
GGOCTACCCA 
AAGCAAGGCA 
TCCATTTCCA 
TGTGGATTTA 
TAATAAGTAA 
TGGTGGGAAA 
GA6AATAGTT 
GATCAGAAAC 
CAGGGAAGCC 
GATQATATTG 
CCTACATGAC 
TGGATTCAGC 
ACTCCTAATT 



CCTCAAAATG 
TCAAAAGAAA 
TATTCATAAT 
AATCTGGTCT 
OCATOCAATG 
AGGACTTACT 
AGGAG6TA6A 
TTTTGGAGGT 
AATATAATAA 
AAAAAGGGGG 
AAATGGCCTT 
TTCCAGGGGA 
AATGTTGTGA 
TTAGAAAGCT 
GTTAACTCAA 
A6AAAAAGAG 
GTAG6AGGCA 
GAAGAAACCC 
CACAA6AAAG 
TGGAAATTCT 
TCACTCAGCA 
GGTTGTAAGG 
AAGGCAATAG 
TATCTCTTTG 
AATTCATAGG 
CTTTGGTATT 
TTOT 



TQAAACAAAT 
CTACATGTAT 
ATTAAAAAGG 
TCATAAAGGT 
CCACA6AGG0 
CAGACACACA 
AAGCCrCTCC 
TATCATGCAT 
AAAATACCTA 
GGCACTTTTA 
CAG0GA06CA 
TTGTAAAOCC 
OGATTACTTA 
TTAATATATT 
CTTTAAAT6T 
AAATACGATT 
CA6ATCACAT 
ATCTGTTCCA 
TCACATCAGC 
QCAAGGTAGC 
CTAGCAG6CA 
AATTCCCATA 
COGCTTCATG 
AAGCTGCAAC 
GGTCATAGTT 
CTT6GACACA 
TTCCTCATTT 
TCAG6ATAG6 
ACACAACTGC 
GACTCAATAC 
AATAAATGTT 
TAATTTAATT 
ATTTT6GAAA 
TTGCTGGAAG 
AATTTATGTC 
TGAATTGAAC 
CACTCTTCTT 

qtcctacaa0 
atatccaacc 
gcccataa6g 
agaaaatcta 
gttagagcaa 
tgaggt6aaa 
catttaggqa 
tgaat6atct 
ttcaaaagct 

ACAAGCTCTT 
CATTTCATTT 
6A1GAGCCTG 
CAACTCTCCC 
GGGATAATCT 
TAAGATACCA 
TOTATGACTA 
GA6GAAGTAC 
ATCATGGTTA 
TGCTCTGCAG 
ACCTTCAGT6 
ACCTGTATAA 
TTACTGTTGT 
CCTACrrCTT 



TTAACCTQGA 
TGAAGAAATA 
AGTAAAAGGA 
ATTCATAGAA 
ATGTGGACAA 
GTATGATTCC 
TTCCTGGTTT 
AGTGAAATTG 
AATCATTGAA 
TCCaCAAAAC 
GGACAACAAC 
AACACTG6AA 
TCCTGACA66 
GCCATTTAAA 
TGTGTTAGTC 
GTTTAATTGG 
AAAGTTATTC 
CT6ATAAACX! 
ACOGGOCCCC 
GGTAGATACA 
AGGCA6ATAA 
OGCACAGGAA 
AATCCTATTC 
GACTTAGAAT 
ATGACAAAAA 
CGAAQTATGC 

GAOAGQAAAT 

TTCACATGAG 
AGGTACAATG 
GAGTGTTAGC 
AAAGGCA6CA 
CA6CCTCTCT 
AGCTTAATAA 
CTCCTGCATT 
TTTCATCTGA 
aOAACACAGA 
TCTTAACAOC 
ATCCTACCTA 
TTTACAATAG 
TAATATATGC 
CAAAATCTCA 
CCATTTATTA 
GGA6TTCACC 
ACAGTTCAGA 
ATTAGGGATG 
TATCAGCATC 
GATGGTAATA 
ACTGTGAGAG 
TCTCTTCCAO 
ATATCATGTG 
TAAACCTGGA 
AAGATTGAGA 
CTGGGTTACT 
CTGQGTTCAG 
TCS66AGTCGC 
ATGCAAATAC 
ATTATTATTA 

ttcattgagt 

AT<SITTGCTA 

gtagaaagtt 
tctataaaca 
taaagactta 
acgtgcaga6 
ataata6cac 
ttcaacatac 

AGCT6AGAGT 
CTQGTAGACA 
GAAGAATTTT 
TACACCAATT 
CTGATCTAAC 
CCTTTTTTGA 
TCTGACTGAA 
CGGAGAAAA6 
GCATCCT6AT 
GCCX3ACCCAG 
CCACTQGAGT 
TGTGATAAAA 
TC6ATAAACT 
ATGGGATATT 
CAGGAAATAT 
AAATTACTGG 
TTTGCAAACC 
ACCTCTTTCT 
AAATATCCAT 
TACAAATAAO 
CCTTCATATC 



TTTACCATAT 
TGCCATGCAA 
TGGAAACAAC 
TGGAATATTA 
ACCATGAAAA 
ATTTACCTGA 
CCAGGGTCTC 
TTGTGGAATG 
TTGTACAGTT 
AAACAGCCCC 
CCCTCTCCCT 
TGAGTCTGAA 
TCAAGCAATT 
ATGGCCCGTC 
TGTTTTCATG 
GCTTAGAGTT 
TTACATGGTG 
CAT066ATCT 
ATGATTCAAT 
ATTCAAGTTG 
CTTTCTCACT 
GTGACTOGTA 
TCCATAGTAT 
TAGCACTACA 
CTTTCAGAAC 
CTAAAAGACA 
TTITGTTTTA 
TGCAAfTCTA 
CCGGTGACTG 
GTAGAACTGT 
CCX5CTTGTGA 
AGAS6AGAGG 
GCCCACCTCT 
CATQAATTAT 
CrGTCTGATT 
TTTGTCCTTT 
G6GAGAGTGC 
TGGTTGGATG 
GCTCTAGTGT 
TCTTTAAAAA 
AGATTGTAGA 
CAACCCAGAT 
AAGTCATTCT 
ATGAATTCAC 
AAATCAAAAG 
GTCCTCTTAfl 
ATTAOGTTGT 
GATGTGGTGG 
CACTGACTTA 
OTTCTTCACT 
AGTCACAGAG 
ACTTCACAAA 
CATACA6AGT 
AAAA6ACAGT 
ACTCGAGCTT 
AGTGAOCTCA 
ATGCSUU^TGT 
TAAAGTAGCT 
CATTAATGAG 
IGQAATATAT 
CTAGATTTAA 
GGGTTTTTTT 
GAQTTACCCA 
AAXGAGCATC 
CAAC3U3GTAT 
TCOIATOCTT 
TTAACTGGGA 
GCGCTGCATC 
CTGGAAGAAG 
ATCAGGGAAT 
AGACTTAGCA 
TGTTTGAAGG 
TTCAACAAAT 
ACX5TATCAAT 
GTCTGTGTCT 
GTGGAGAGCA 
GTCCC AGACC 
GAGGAGOTTG 
GGCACTGACT 
ATT6GAATGG 
TGAATGCACA 
AGA6AA6TCT 
ACA6CCTCTT 
TGTGCCaGOC 
GGACAGGAGA 
TAAGTTTGGT 
TCAAAGGAAT 



GACCCA6C6A 
AAAAATGTAC 
ACAAATGTCC 
TTCGACCACA 
TAACACTAGA 
AATGTTTGGA 
CA6GAAGGGA 
AGATCAT6AT 
GAATTTATGG 
CCACTCTGGT 
GGCCACAGAC 
GCCAGQTGCT 
TATTTTTCXSG 
T6TTTCAATT 
CTGCTGATAA 
CCACGTGATT 
GCT6CAAGAG 
CCTGA6GCTT 
TACCTCTACC 
AGATTTGGGT 
GAGCCTATGC 
G6ATCACTGC 
GCTATAA6AT 
TTCCTTGTTA 
TGAAAAACAG 
ATGCAAAATC 
CAGCTG6AGT 
lACTTATTCT 
CTGACTTGCIA 
AATCCTGTCC 
AATCTGAAGT 
CATAAATTTA 
GCTTCCTCTA 
TTTTGAGAAT 
ATATTTTACT 
ATCTAAATT6 
CTTGCAGCCA 
TGATCCACAG 
AACCAGCAAT 
CGTAGTTTTG 
AAGATTGAAG 
ATATCATTTC 
GACAGGAATC 
TCAGtGAAGC 
AOAACAACAT 
GAG€CAAAGA 
CCTGTTGCAG 
TCTGATGGGT 
AACA0ATCTT 
TTGATCAAGT 
CACTCTGATT 
AACTAAGAAA 
GGGTTGGCAT 
CAGCACTGTA 
TGCTCTTCAC 
TGGCAGAAAA 
TTACAACAGT 
ATAATTATAC 
ATTCAGAGGA 
TGGTTTAGAG 
ACAGGCTTAG 
CCCCATTCTC 
TGTAAAGTCC 
ATGAGGAAAG 
AACAGGGCTT 
6ATGACTGTT 
AGCTAAACCT 
TTTAGTTCflfl 
TCAAATATAA 
AACATCAAAG 
TGGGTTTAGT 
TTGATAGGTC 
CCACTGATGC 
TTTTACAGGT 
CTGAGTGGCC 
TTTACTCAGA 
CCAACGATAC 
TGTAATAGAG 

cagtcacata 
gcaggcttgg 
ggatgaaaga 
gagaagcaat 
ctgcttctgc 
gacattcccx: 
tactgcatct 
aatatatagt 
atttaoatgc 
CATCAAGAAA TTTTACXSiaA CCAGTOTGOA ATCTACTOAT TTTGCAAATG CTCCAGAAOA 7020 
AAGTC6AAA0 AA(3ATTAACT CCTGGGTGGA AAGTCAAACG AATGGTAGGA GAGCCACCCA 7080 
TTATA6AAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGQ 7140 
GAACAQ6TGT GQGGATTGAG ATGGGTTTGC AGG6AGGGCT GAAGAGGOCA CTCCAOATGA 7200 
AaOATTTGTC CAAATGAATA TOAAGAGAGC CTAGGQGAGC CAAGQAQOAA ATCACAGGAA 7260 
GCCAATTA6A TGGmCACA TCT6GA6AAT TATTT QCn ' A TG6CCCT6CA T6ACAATAGC 7320 
TTTGTGGATC CCCTGTCTCC GCTCAGACCT ATTTTQAGAT CATATCCTTT ACTTTAAATC 7380 
AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAACATT AGAAAGCGTC TCTCGTCTCC 7440 
TTTACTAATT GGGAAACAAG CAGCTCTCTG GTAAATCACC CTTTTGTCTC TGAGCTGQAG 7500 
CTGCXrrGGAT CACATCTOTA GCCAATGTOT TCTGCAGGOA TTATCACAGC TCTCITCCOC 7560 
ATCAAGGGCA AAOAOCTTOA C3UUIGTCTCC ATTCTACAOA CATCTTTCTT ACCTCOCACC 7620 
TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGAG TGAATAQGAA GATACCCCCG 7680 
GAAGTAGTGT CTGACAGCAC AGGACATGCG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 
CTAAAATGCA ATCAGGGCCT CCTTCCTCTG AATGG6GACC CCGTAGTTAA AAAAAAATAA 7800 
AAOTAOGAAG AGGAGG6AG6 GA6AAAGGAA AGACACATGT TGOAAOAOTA GACAAAATCA 7860 
6TTTATCA6T ATTCCAAATC AGATGATT66 AGACATTCAT ACACAGAGAA OGTGAACTCC 7920 
TTCTCTATCA CAAGAAGTGA TGTCTCCATC AAGGGTAACT TTATAOGACT GGAGCCTTGA 7980 
AGAAAGCTGC ATCTGGTGAA CCACTOGTCa GTGAGTCTAA CAATTCAAAG ATCAAAGTCA 8040 
GTGAGTCTCA A6CAGGGATT TGGGTCAATA ATTAACGATC AGTCAC3GAAC ATTT6CAAAG 8100 
CATCTTCCAG ACAA6CCATT TQTAGCT1GT GTAAAAQACT CTTTTATTCT TTCCCTTGCA 8160 
GAAAAAATTA AAAACCTATT TCCTGATGOO ACTATTGGCA ATGATAOGAC ACTOGlTCrT 8220 
GTGAAOGCAA TCTATTTCAA AGGGCA6TGG GAGAATAAAT TTAAAAAAGA AAACACTAAA 8280 
GAGGAAAAAT TTTQGCCAAA CAAQGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 
TAATACATGG AAT6TTAAAC ATTTCTGATG GAATGTAACA T6ATAA0TAA AAAATAAAAA 8400 
TTOTTCATOT CTGTTATTTT 6TT6TTTTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 
QAAAAACTAT TGTTTCTAAC TCATGGAATT CCTG6GTTAT TTCTTAGAAG AAGAAGGAT6 8520 
TGTTGCTATC TCAATAATAT TATCTTTTTT GTCTTGTQTT TCAOGTGTTA TTTGTTGGAC 8580 
ACATTGATTT ATTGCAGAAT ACATACAAAT CTQTACAGAT GATGAGGCAA TACSU^TTCCT 8640 
TTAATTTTGC CTTGCTGGAG GATGTACAGG CCAAGGTCCT GGAAATACCA TACAAAGGCA 8700 
AAGATCTAAG CATQATTGTO CTGCTGOCAA ATGAAATOOA TGGTCTGCAG AAG GTAA GAA 8760 
CTTGCATCTA CAACTCTTCC TTCTACTOCC GGACATTTTT CCAAAGATAC CAAGTTTAAA 8820 
C3VAGGTAAAA GCTTATGACC GAGTTOOCTC AAAATGATQA AAAATTCTAA ATGAGGAATG 8880 
ATGACTCACC TTCATATTAC AAATATTTGA GCATAGGGCC TGACACAAAC TQAAAGCTTA 8940 
GTTTTTGTTT GTTTQTTTGT TTTTATTATT ATTATTATAA TACTTTAAGC TTTAGGGTAC 9000 
ATQTGCAGAA TGTaCAQQTT AGTTACATAT GTATACATGT GCCATGCTGG TGTGCTGCAC 9060 
OCATTAACTC ATCATTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTCC CCCCTCCCCC 9120 
CACCCCACAA CAGTCCTCAG AGTGTGATGT TACCTTCCTG TGTCCAA6TG TTCTCATTGT 9180 
TCAATTOCCA TCTATGATTT AATTCCATCT ATGGCTTAGT TAAXGATTAA TTTAXTAGAG 9240 
TTACATGCAT TGGATATCAA TTTGATQATA TTATTATGCA GCAATTTAAA CTTX3ACTGGG 9300 
AGAAATATAT ACCAATGTGA GQAAAGTTTA CAAATAGGCC GAGTAGAAAA GG6AATACAA 9360 
ATTTAGGAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATGTATACA 9420 
GAAAAATATG ATGAGCCTAT TAAAAATTGA CACAT6TAGT AG GCTGTT QQ CACAAQAAAT 9480 
AGTQATACAT ACAGTTCATT GTGTACAAAA TAATOTAATC ATATTTTACA TOTO TATCAT 9540 
ACAQTTGTAT ACATACATAT GTACACATAT ACATATACGT AAAAACATGA TTCTGTTTTT 9600 
ACATACATGT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGQACTCATA 9660 
TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAACCT TCTGTATTTC ACATTTATTG 9720 
CCAAAATAAC GAATCTCCAC ATAGTCAATT GATTGTTAAG 6TGTATTAGA GATCGACAGT 9780 
TAGTCATATC AGTTTCTTTT TTCCATTTGT ATAGCTTQAA GAQAAACTCA CTGCTGAGAA 9840 
ATTGATG6AA TQGACAAGTT TGCAGAATAT GAGAOAQACA TGTGTCGATT TACACTTACC 9900 
TOGGTTCAAA ATGGAAGAGA GCTATGACCT CAAGQACACG TTGAGAACCA TGGGAATGGT 9960 
GAATATCTTC AATGGQQATG CAGACCTCTC AGGCAT6ACC TGGAGCCACG GTCTCTCAGT 10020 
ATCTAAAOTC CTACAGAAGO CCTTTGTGGA GGTCACTGAG 6AGG6AGTGG AAGCT6CA6C 10080 
TGCCACOOCT GTAOTAGTAG TGQAATTAIC ATCTCCTTCA ACTAATGAA6 AgiTCTGTTO 10140 
TAATCACCCT TTCCtATTCT TCATAAGGCA AAATAAGACC AACA6CATCC TCTTCTATGG 10200 
CAGATTCTCA TCCCCATAGA TGCAATTAGT CTOTCACTCC ATTTAGAAAA TGTTC ACCTA 10260 
GAGGTGTTCT G6TAAACTGA TTGCTGGCAA CAACAGATTC TCTTGGCTCA TATTTCTTTT 10320 
CTATCTCATC TTGATGATGA TA6TCATCAT CAAGAATTTA AT6ATTAAAA TAGCATGCCT 10380 
TTCTCTCTTT CTCTTAATAA GCCCACATAT AAATGTACIT TTCCnCCAG AAAAATTTGC 10440 
CTT6AG6AAA AATGTCCAAO ATAAGATGAA TCATTTAATA C08TGTCTTC TAAATTTGAA 10500 
ATATAATTCT GTTTCTGACC TGTTTTAAAT QAACCAAACC AAATCATACT TTCTCTTCAA 10560 
ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACXTTAAA TCCTTCTTAT 10620 
GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCATCAATA AAATAATGAC ATAAAATCAT 10680 
TTTTGCTTTA CCTOTTTTCT CTCTGQAAAG GGCAAGTGTC CAOTTACACA TAGGAAAGAT 10740 
AATTTAGAGA TATATTAATC ATATATAAAG GAAAATTAAA AACAGAQTAG TTCATSVXGA 10800 
GCX:TGGAGTA GAAQGCATAT CCCAGAACAG GAGGAGCCTT GTAAACCACA TAGGAACrrC 10860 
CTATTTTATG CTAAAGGGAT AAGAAACTCA TTACAGGCTT TGATGGTTGT TTGTCAAAGA 10920 
QGGGCATAAA ATTATCATAT CCACATCTAG AAAATACATC TCTGGCTAOG CTGATATCAA 10980 
T6QATG0GAG GAAASAACAG TGT06TTACC ATATATAAAT TAOQAAATCA TTAGAGTATT 11040 
G6GA0TGGAA ATGGAGAGAA AGAAAGAGCC TG6GG6AATT ATTTAGQAAA TAATAGTTAC 11100 
AGAAAGACAT CTAAGITGCT GACCTATCTQ ACTGGATGGA TGGAAGAATA TCTTGTTTCT 11160 
GAGAGAAAAA AAGACTTTGG GTTTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 
TCAAATGGAT TTGCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAG AAACAGAATA 11280 
TQATCTQAAG CTCTAAATTT 6TGATATTCA ATATAAATAC TTTAGAGTCA TTGGGATAAA 11340 
TATGGTAGTT OTAfiCTAAAA QCAAAAATAA GATACTAGGG AGAAAGGATA AAGTTAGAAG 11400 
AAAGAAGAAT CXA6AATTQA CCTTGAAGTA TATCAOCATG TGTAAAGATC AGGAATTGAT 11460 
CATTTTTATT TTGCAaAAAG TAGCTTTTCT TAGGGTTCCA TATTTACTCC CATAGATTCT 11520 
TCCC 

Seq ID NO: 465 Protein sequence 
Protein Accession #: BAB21525.1 

1 11 21 31 41 51 

| | | | | | 

MNSLSEANTK FMFDLFQQPR KSKENNIFYS PISITSALGM VLLGAKDNTA QQISKVLHPD 60 

QVTENTTEKA ATYHVDRSGN VHHQFQKLLT BFNRSTDAYB LKIANKLFGE KTYQPLQEYL 120 

DAZXKFYQTS VBSTDFAMAP EE5RXKINSW VE8QTNEKIK NLFFD6TZGI7 DTTLVLVMAI 180 

ypKGQtfBNRF iCKENTKEEKP NFNKNTyKSV QMMRQVN&EN FALIiEDVOAK VLEZpyRGED 240 
LSMIVLLPNE IDGLQKItEEK LTABXLHEWT SLQNMRETCV DLHLPRFKME BSYDIiKDTLR 300 

TMGMVNZFNG DADLSGMTWS HSLSVSKVLH XAFVEVTEEG VEAAAATAW WELSSPSTN 360 
EEFCCNHPFL FFZRQNXINS ILFYGRFSSP 

Seq ID NO: 466 DNA sequence 

Nucleic Acid Accession #: NM_001910.1 

Coding sequence: 50..1240 

1 11 21 31 41 51 

| | | | | | 

GQAGAGAA6A AAGGAGGGGG CAAGGGAGAA 6CTGCTGGTC GGACTCftCAA TGAAAACGCT 60 

CCTTCTTTTG CTGCTGQTGC TCCTGGAGCT GGGAGAOGCC CAAGGATCCC TTCACAGGGT 120 

6CCCCTCAGG AGGCATCC6T CCCTCAAGAA GAAGCTG066 GCACGGAQCC AGCTCTCTGA 180 

GTTCTGGAAA TCCX3VTAATT TGGACATGAT CCAGTTCACC QAGTCCTGCT CAATGGACCA 240 

GAGTGCCAAG GAACCCCTCA TCAACTACTT GGATATGGAA TACTTCGGCA CTATCTCCAT 300 

TGGCTCCCCA CCACAGAACT TCaCTGTCAT CTTCGACACT GGCTCCTCCA ACCTCTGGGT 360 

CCCCTCTGTG TACT6CACTA GCCCAGCCT6 CAAGAC3QCAC AGCAGGTTCC A6CCTTCCCA 420 

GTCCAGCACA TACAGCCAGC CAGGTCAATC TTTCTCCATT CAGTAT6GAA CCGGGAGCTT 480 

GTCCGG6ATC ATTGGAGCCG ACCAAGTCTC T6TGQAAGGA CTAACCGTGG TTGGCCAGCA 540 

GTTTGGAGAA AGTGTCACAG AGCCAGGCCA GACCTTTGTG GATGCA6AGT TTGATGGAAT 600 

TCTGGGCCTG GGATACCXTCT CCTTGGCTGT GGGAGGAGTG ACTCCAGTAT TTGACAACAT 660 

GATGGCTCAG AACCTaOTOQ ACTTOCOGAT GTTTTCTGTC TACATGAGCA GTAACCCAGA 720 

AGGTGQTGCQ OGQAOCSQAGC TGATTTTTGQ AGGCTACGAC CACTCCCATT TCTCTGGGAG 780 

CCTGAATTGG GTCCCAGTCA CCAAGCAAGC TTACTGGCAG ATTGCACTGG ATAACATCCA 840 

GGTGGGAGGC ACTGTTATGT TCTGCTCCGA GGGCTGCCAG GCCATTGTGG ACACAGGGAC 900 

TTCCCTCATC ACTGGCCCTT CCX3ACAAGAT TAAGCAGCTG CAAAACGCCA TTGGGGCAGC 960 

COCOGTGGAT GGAGAATATG CTGTGGAGTG TGCCAACCTT AACGTCATGC CGGATGTCAC 1020 

CTTCACCATT AAOGGAGTCC CCTATACCCT CAGCCCAACT GCCTACACCC TACTPGGACTT 1080 

0GTGGAT6GA ATGCAGTTCT GCAGCAGTGG CTTTCAAGGA CTTGACATCC ACCCTCCAGC 1140 

TGGGCCCCTC TGGATCCTGG GGGATGTCTT CATTCGACAG TTTTACTCAG TCTTTGACOG 1200 

TGGGAATAAC CXSTGTGGGAC TGGCCCCAGC AGTCCCCTAA GGAGGGGCXTT TGTGTCTGTG 1260 

CCTGCCTQTC T6ACAGACCT TGAATATGTT AGGCTGGGGC ATTCTTTACA CCTACAAAAA 1320 

GTTATTTTCC A6AGAAT6TA 6CTGTTTCCA GGGTTGCaAC TTGAATTAAG ACCAAACAGA 1380 

ACATGAGAAT ACACACACAC ACACACATAT ACACACACAC ACACTTCACA CATAC ACACC 1440 

ACTCCCACCA COSTCATGAT GQAGGAATTA OSTTATACAT TCATATTTTO TATTQATTTT 1500 

TGATTATGAA AATCAAAAAT TTTCACATTT GATTATGAAA ATCTCCAAAC ATATOCACAA 1560 

GCROAOATCA TGGTATAATA AATCCCTTTG CAACTCCACT CAGCCCTGAC AACCCATCCA 1620 

CACAOSGCCA GGCCTGTTTA TCTACACTGC TGCCCACTCC TCTCTCCAGC TCCACATGCT 1680 

GTACCTGGAT CATTCTGAAG CAAATTCCGA GCATTACATC ATTTT6TCCA TAAAT ATTTC 1740 

TAACATCCTT AAATATACAA TCGGAATTCA AGCATCTCCC ATTOTCOCAC AAATGTTTGG 1800 

CTGTTTTTGT AGTTGGATTQ TTTGTATTAG GATTCaAGCA AGGCCCATAT ATTGCATTTA 1860 

TTTGAAATGT CTGTAAGTCT CTTTCCATCT ACAGAGTTTA GCACATTTGA ACX3TTGCTGG 1920 

TTGAAATCCC GAGGTGTCAT TTGACATGGT TCTCTGAACT TATCTTTCCT ATAAAATGGT 1980 

A6TTAGATCT GGAGGTCTGA TTTTGrGGCA AAAATACTTC CTAGGTGGTG CT GGGT ACTT 2040 

CTTGTTGCAT CCTQTCAOQA GOCAGATAAT GCTGOTOCCT CTCTATTGGT AAT OTTAAG A 2100 
CTGCTGGGTG GGTTTGGA6T TCTTGGCTTT AATCATTCAT TACAAAOTTC AGCATTTT 



Seq ID NO: 467 Protein sequence 
Protein Accession #: NP_001901.1 

1 11 21 31 41 51 

| | | | | | 

MKTLLLLLLV LLELGEAQGS LHRVPLRRHP SLiaOOiHARS QLSEFWKSHN LDMIQPTESC 60 
SMDQSAKEPL INYLDMBYPG TISIGSPPQN FTVIFDTGSS NLWVPSVYCT SPACKTHSRP 120 
QPSQSSTY8Q PGQSFSIQYG TGSLSGIXGA DQVSVEGLTV V6QQF6BSVT EFGQTFVDAB 180 
FDGILGLGYP SLAVGGVTFV FDNMMAQNIiV DLPMFSVYMS SNPB6GAGSB LIFGGYDHSK 240 
PSGSLNWVPV TKQAYWQIAL DNIQVGGTVM FCSBGOQAIV DTGTSLIT6P SDKIKQU3NA 300 
IGAAPVDGEY AVBCANUJVM PDVTPTINGV PYTLSPTAYT UiDPVDGMQF CSSGFQGLDI 360 
HPPAGPLWXIi GDVFIRQPYS VFDRGNNRVQ LAPAVP 



Seq ID NO: 468 DNA sequence 

Nucleic Acid Accession #: NM_018058.1 

Coding sequence: 319..1575 

1 11 21 31 41 51 

| | | | | | 

TACGC6CTGC GGGACCGGCA GGGGAACGCC ATCGGGGTCA CAGCCTGCGA CATC6AGGGG 60 

GACGGCOGGG AGGAGATCTA CTTCCTCAAC ACCAATAATG CCTTCTCGGO GGTQGCCAOO 120 

TACACCGACA AGTTGTTCAA GTTCCGCAAT AACCGGTGGG AAGACATCCT GAGCQAT6A6 180 

GTCAACGTGG CCCQTGGTGT GGCCAGCCTC TTTGCCX3GAC GCTCTGTGGC CTGTGTGGAC 240 

AGAAAG06CT CTGGACGCTA CTCTATCTAC ATTGCCAATT AOGCCTACGG TAATGTGGGC 300 

CCTGATGCCC TCATT6AAAT GGACCCTGAG GCCAGT6ACC TCTCCOGGGG CATTCTGGCG 360 

CTCAGAGATG TGGCTGCTOA GGCTGGGGTC AGCAAATATA CAGGQGGOOS AGGCGTCAGC 420 

GTGGGCCCCA TCCTCAGCAG CAGTGCCTCG GATATCTTCT GCGACAATGA GAATG6GCCT 480 

AACTTCCTTT TCCACAAOCG GGGCGATGGC ACCTTTGTGG ACGCTGCGGC CAGTGCTGQT 540 

GT6GACGACC CCCACCAGCA TGGGCGAGGT GTCGCCCTGG CTGACTTCAA CCGTGATGGC 600 

AAAGTGGACA TCGTCTATGG CAACTGGAAT GGCCCCCACC GCCTCTATCT GCAAATQAGC 660 

ACCCATGGGA AGQTCCGCTT COGQGACATC OCCTCACCCA AOTTCTCCAT GCCCTCCCCT 720 

GTCCGCACGG TCATCACCGC COACTTTGAC AAT6ACCAGG AGCTGGAGAT CTTCTTCAAC 780 

AACATTGCCT ACCGCAGCTC CTCAGCCAAC CGCCTCTTCC GOGTCATCCG TAGAGAGCAC 840 

G6AGACCCCC TCATCQAGGA GCTCAATCCC GGCGACGCCT TGGAGCCTGA GGGCCGGGGC 900 

ACA06GGGTG TG6TGACCGA CTTCGACGGA GACGGGATGC TGGACCTCAT CTTGTOCCAT 960 

GGAGAOTCCA TGGCTCAGOC GCTOTCOGTC TTCCGGGGCA ATCAGGGCTT CAA CAACAA C 1020 

TGGCT6G8A6 TG6T6CCAC6 CACOCGGGTT G6GG0CTTTG CCAGGGQAOC TAAGGTCGTO 1080 

CTCTACACCA AGAAGAGTGG GGCCCACCTG AGGATCATCG AC66GGGCTC AGGCTACCTG 1140 

TGTGAGATGG AGCCOGTGGC ACACTTTGGC CTGGGGAAGG ATGAAGCCAG CAGTGTGGAG 1200 

GIGAG6TGGC CAGATGGCAA 6ATGGTGAGC CG6AACGTG6 CCA6CGGG6A 6ATGAACTCA 1260 



GTGCTGOAfSA TCCTCTACCC COaGQATGAG GACACACTTC AGGACCCAGC CCCACTGGAG 1320 

ACACCAATQA ATGCATCCftG TTCCCATTG6 TGT6CGCTC6 AGACAAGCCC GTAT6TGTCA 1380 

ACACCTATOG AAGCTACAGG TGCCGGACCA ACAAGAAGTG CAGTCGGGGC TACGAGCCCA 1440 

ACGAQGATGO CACAQCCTGC GTGGGGACTC TCGGCC3W3TC ACCX5GGCCCC CGCCCCACCA 1500 

CCCCCACCGC TGCTGCTGCC ACTGCCGCTG CTGCTGCCGC TGCTGGAGCT GCCACTQCTG 1560 

CACCGGTCCT CGTAGATG6A GATCTCAATC TGGGGTCQGT GOTTAAGOAQ AQCTGCGAGC 1620 

CCAGCTGCIG AOGAGGGGTG G6ACATGAAC CAGOGGATGG AOTCCAGCAG GGGAQTGGQA 1680 

AAGTGGGCTT GTGCTGCTGC CTAGACAGTA GGQATOTAAA GGCCTGGGAG CTAQACCCTC 1740 

CCX»AGCCCA TCCATGCACA TTACTTAGCT AACAATTAGO GAGACTCGTA AGGCCAGGCC 1800 

CTGTGCTGGG CACATAGCTG TQATCACAGC AGACAGGGTC GCTGCCCTGA TGGC6CTTAC 1860 

ATTCCAGTGG GTCTAATGAC CATATCTTAO GACACAGATG TGCCCAGGGA GGTGGTGTCA 1920 

CTGCACS^GGA AOXATGAOGA CTTTAGTQTC CTGAGTTCAA ATCCTGATTC AGGAACTCAC 1980 

AAAGCTATGT GACCTTACAC CAGTCACTTA ACTTGTTAGC CATCCATTAT CGCATCTGCA 2040 

AAAT6GGGAT TAAGAATAOA ATCTTGGGGT TAOIGTGGAG ATTAGATTAA ATGXATGTAA 2100 

GACACTTG6C ACAAAACCTG GCACATAGTA AAGGCTCAAT AAAAACAA6T GCXTTCTCACr 2160 
6GGCTTTGTC AACACGTG 

Seq ID NO: 469 Protein sequence 
Protein Accession #: NP_060528.1 

1 11 21 31 41 51 

| | | | | | 

MDPEASDLSR GIIiALRDVAA EAGVSKYTGG R6V5VGFILS 8SASDIFCDN ENGPHFLFEN 60 

RGDGTFVDAA ASAGVDDFEQ HGRGVALADF NRDGKVDIVY GHVINGFHRIiY LQMSTHGKVR 120 

FRDIASPKFS MPSPVRTVIT ADFDNDQELE IFFHNZAYRS SSAKRLFRVI RREBSDPIiZE 180 

ELNPGDAIiEP EGRGTGGWT DFDGDGKLDL ILSHGBSMAQ PLSVFRGNQG FNNNHLRWP 240 

RTRVGAFARG AKWLYTKKS GAHLRIIDGG SGYLCEMEPV AHPGLQKDBA SSVBVTWPDO 300 

KMVSRNVASO EMNSVLBILY PRDEDTLQDP APLETPMHAS SSHSCALETS PYVSTPMBAT 360 
GAGPTRSAVG ATSPTRMAQP AHGLSASKRA PAPPPPPLLL PLPItLLPLXiE LPZiLHRSS 

Seq ID NO: 470 DNA sequence 
Nucleic Acid Accession #: AJ279016 
Coding sequence: 1..1962 

1 11 21 31 41 51 

| | | | | | 

ATGTCCAGGA TGTTACOGTT CCTGCTGCTQ CTCT6GTTTC TGCCCATCAC TQAGGGGTCC 60 

CA006GGCTG AACCCATGTT CACTGCAGTC ACCAACTCA6 TTCTGCCTCC TGACTATGAC 120 

A6TAATCCCA CCCAGCTCAA CTATGGTGTG GCAGITTACTG ATGTGQACCA TGATGGGGAC 180 

TTTGAGATCG TOOTGGOGGO QTACSU^TGGA CCCAAGCTGCt TTCTGAAGTA TQACGGGGCC 240 

CA6AAG0G6C TGOTGAACAT OGCGGTCXSAT GAGGGCAGCT CACCCTACTA 06GGCTGCGG 300 

GACaSGCAGG GGAAC3GCCAT CGOOGTCACA GCCTGOQACA T06ACQGGGA CGGCCGGGAG 360 

GAGATCTACT TCCTCAACAC CAATAATGCC TTCTCGGGGG TCGCCAOSTA CACCGACAAG 420 

TT6TTCAA6T TCGGCAATAA CCGGTGGGAA 6ACATCCTGA GCQATQAGGT CAAC6TGGCC 480 

a ritSGrGlt S G CCAGGCTCTT TQCCXSQAGGC TCT G TGGOCT GTGTGQACAG AAAGGQCTCT 540 

GGACXSCTACT CTATCTACAT TGCCAATTAC GCCTAOSGTA ATOTGG6CCC TGATGCOCTC 600 

ATTGAAATGG ACCCTGAGGC CAGTGACCTC TCCCGGGGCA TTCTGGCGCT CAGAGATGTG 660 

GCTGCTGAGG CTGGQGTCAG CAAATATACA GGGGGCCGA6 GOGTCAGCQT GGGCCCCATC 720 

CTCA6a«3CA GTGCCTCGGA TATCTTCTGC GACAATGAGA ATGGGCCTAA CTTCCTTTTC 780 

CAC3UUX:GGG GOGATGGCAC CTTTGTGGAC GCTGCQQGCA GTGCTGGTGT GGAOSACCCC 840 

CACCA6CATG GGOGAGGTGT GGCCCTGGCT 6ACTTCAACC GTOAtGGCAA AGTGGACATC 900 

GTCTATGGCA ACTGGAATGG CCCCCACOGC CTCTATCTGC AAATGAGCAC CCATGGGAAG 960 

GTCOGCTTCC GGQACATOGC CTCACCCAAG TTCTCCATGC CCTCCCCTGT COGCAC36GTC 1020 

ATCACCX5CCG ACTTTCACAA TQACCAG6A6 CTGGAGATCT TCTTCAACAA CATTGCCTAC 1080 

GGCA8CTCCT CA0CCAAC06 CCTCTTCOGC GTCATOOGTA GAGAGCA06G AGACCCCCTC 1140 

AT0GAGGA6C TCAATCCCX^G OQAGGCCTTG GAGCCTGAGG GCCGGGGCAC AGGGGGTGTG 1200 

GTGACCGACT TOQAOGGAGA CSGGGATGCTG GACCTCATCT TGTCCCATGG AGAGTCCATX3 1260 

GCTCAGCCOC TGTCCGTCTT CCGGGGCAAT CAGGGCTTCA ACAACAACTG GCTGCX3AGT6 1320 

GTGCCA06CA CCCGGTTTGG GGCCTTTGCC AQGGGAGCTA AGGTC6TGCT CTACACCAAQ 1380 

AAQAOTGGGG CCCACCTGAG GATCATCGAC GGGG6CTCAG GCTACCTGTG TGAGATGGAG 1440 

CC O GTGGCAC ACTTTGGCCT GGGGAAGGAT GAAGCCAGCA GTQTGGAGOT GAOGTQGCCA 1500 

GATGGCAAGA TGGTGAGCCG GAAC8TGGCC AGCGGGGAQA TGAACTCA6T GCT6GAGATC 1560 

CTCTACCCCC GGQATGAGGA CACACTTCAG GACCCAGCCC CACTGGAGTG TGGOCAAGGA 1620 

TTCTCCCAGC AOQAAAATOG CCATTQCATQ GACACCAATG AATGCATCCA GTTCCCATTC 1680 

GTGTGCCCTC QAQACAAGCC CGTATGTGTC AACACCTATG GAAGCTACAG GTGCCGGACC 1740 

AACAAGAAGT GCAGTCGGGG CTAOGAGCCC AACGAGGATG GCACAGCCTG OGTGGGQACT 1800 

CTOQGCCAGT CACCGGGCCC CCGCCCCACC ACCCCCACOG CTGCTGCTGC CACTGCGGCT 1860 

GCTGCTGCCG CTGCTGGAGC TGCCACTGCT GCACOGGTCC TCGTABATGG AOATCTCAAT 1920 

CTGGGGTOGG TGGTTAAGGA GA6CTGCGAG CCCAGCT6CT GAGCSkGGGGT GGGACATGAA 1980 

CCAGC3GGATG GAGTCCAGCA GGGGA6TGG6 7UUW3TGGGCT TQTGCTGCTG CCTAGACAGT 2040 

AGGGATGTAA AGGCCTGGGA GCTAGACCCT CCCCAA6CCC ATCCATQCAC ATTACTTAGC 2100 

TAACAATTAG GGAGACTGGT AAGOCCAGGC CCTGTGCT66 GCACATAGCT GTGATCACAG 2160 

CAGACAGGGT GGCTGCCXriG ATGGOQCTrA CATTCCAGTG 6GTCTAATGA GCATATCTTA 2220 

GGACACAGAT 6TGCCCAGGG AGGTGGTQTC ACTGCAC3U3G AAGTATGAGG ACTTTAGTGT 2280 

CCTGAGTTCA AATCCTGATT CAGGAACTCA CAAAGCTATG TGACCTTACA CX»GTCACTT 2340 

AACTTGTTAG CCATCCATTA TCGCATCTGC AAAATGGGGA TTAAGAATAG AATCTTGGGG 2400 

TTAGTGTGGA GATTAQATTA AATGTATGTA ASAC ACTTG G CACAAAACCT GGCACATAGT 2460 
AAAG6CICAA TAAAAACAA6 T6CCTCTCAC TGG6CTTTGT CAACAOG 

Seq ID NO: 471 Protein sequence 
Protein Accession #: CAC08451 

1 11 21 31 41 51 

| | | | | | 

MSRMLPFLLL IiWPLPITEGS QRAEPMFTAV TNSVLPPDYD SNPTQLNYGV AVTDVDHDGD 60 

FEIWAGYNG PNLVLKTORA QKRLVMIAVD ERSSPYYAIiR DRQGNAIGVT ACDIDGDGRE 120 

EIYFIJmnZA PSGVATYTDK LFKFRHNRWB DILSDEVNVA R6VASLFAGR SVACVDRKGS 180 
(31YSIYIANY AYGNV6PDAL lEMDPSASDL SRGILALRDV 
LSSS2VSDIFC DNENQPNFLF HNRGDGTFVD AAASASVDDF 
VYGNNNGPHR LYLQMSTBGK VRFRDIASPR PSMPSPVRTV 
RSSSAKRLFR VIRREHGDPL IEELNP6DAL EPB6RGTGGV 
AQPLSVPRGN QGFNNNWLRV VPRTRFGAPA RGAKWLYTK 
FVAHFGLOKD BASSVEVTWP DGKMVSRNVA SGBMNSVXiEI 
FSQQENGHCM DTKBCIQFPF VCPRDKPVCV NTYGSYRCRT 
LGQSPGPRPT TPTAAAATAA AAAAAGAATA APVLVD6DUI 

Seq ID NO: 472 DNA sequence 
Nucleic Acid Accession #: FGENESKH 
Coding sequence: 1..4794 






AAEAGVSKYT 
BQHGR6VAIA 
ZTADFDNDQE 
VTDPDGDGML 
KSGAHLRIID 
LYPRDEDTI^ 
NKKCSRGYEP 
LGSWKESCE 



ATG6CGT6TC 
AGOGGCTCCT 
6TTCTQAAGT 
TCACCCTACT 
AT06ACGGG6 
CACAGCAGCT 
CCACCTACAA 
TCCTCCCTQG 
TGTCGGGGTG 
GGGGTGGCCA 
CTGAGOSA^rG 
GCCTGTGTGG 
GGTAATGTGG 
GGCATTCTGG 
TTCTCCCACA 
6GAGGAGACC 
TGCCGGCTGG 
CAGA6GGAGG 
TCCAAAAGCC 
GCGOCTTCTC 
OCCCTTGTCA 
CCCCACCCCC 
CTGATGGCTG 
CTGAGAA6CT 
GAGCTGGGAd 
CTGGGAGAAC 
CCCAAGGTCA 
GGCCCCGGGA 
CTCTCCCaTC 
GTQCOOGQAG 
CTG6CGTGGA 
TTTAG6CTCA 
CTGCAGTTCC 
TCTGCCACTC 
ATCCTCAGCA 
TTCCACAACC 
GCCTTCATOG 
CTAGCAGAAA 
CCACATTQCX; 
TTCTTGACX3C 
CftOGGGGCCC 
ACTGCCTATT 
TTGTCCTCTG 
GCCCTGGCTG 
CCCCACCGCC 
TCACCCAAGT 
6ACCAGGAGC 
CTCTTCOGAT 
GGTCAGGGAG 
AAGGTCAACA 
A6AGGCTGTO 
AAAGGGAAGG 
OCACACIAOC 
GTCCAATCAC 
CGGGGTCCAA 
GCTACGGGCT 
AGGGGCTAGG 
AGAAAGGGGC 
CCAGGAAAA6 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
CGGGGTCCAA 
GCTATGGGGT 
AGGGGCTATG 
AGAGAGCACG 
GGCCGGGGCA 
TTGTCCCATG 
AACAACAACT 
AA6GTCGTGC 
GGCTACCTGT 
AGTGTGGAGG 



11 
I 

CGGGAGGACT 
CCCCMSCATC 
ATQACOGQQC 
ACGCGCT6CG 
ACGGCCGGGA 
CAGCGCAGGT 
CCCCTGCAQG 
GTCAGGCTTC 
GACTGAGACC 
OGTACACCGA 
AG6TCAA0G7 
ACAGAAAGGG 
GCCCTGATGC 
C6CTCA6AGA 
CTGCCTCTCC 
CAGAGGAGGC 
GCTGGAAGGA 
CTGGGGCAGC 
ATTTGGCTGA 
CAQCCCACCC 
CTCAGCTAAT 
GAGCCCCAGG 
AGGCTTTGGG 
GGGAGGAAAG 
GTCCCTGGAG 
CTCCCATTTT 
CACAGGAGTG 
GGGTGGCCAA 
CCCTGGTCCC 
CT6GCCTGCC 
ACCAGATGGA 
GGAAAGCACG 
CCTCAGGCCT 
ACTGTGGGTC 
GCAGTGCCTC 
GGGGCOATGO 
TTCACCTCAA 
CTGQTCCTTC 
ATCATGGTTT 
AAGGCTTG6C 
CACCCTGCCT 
ACATTGTCCT 
AAA6AGTCAA 
ACTTCAACCG 
TCTATCTGCA 
TCTCCATGCC 
TGGAGATCTT 
GCTCCATCCT 
AAGGTTTAAG 
CAGGTCCCCT 
6QAATGCAGG 
GAAATGTGGC 
ACftAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGCTCCAATC 
TAOGGGCTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCACAGAA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 
GAGACCCCCT 
CA6GGGGTQT 
GA6AGTGCAT 
GGCTGCOAGT 
TCTACACCAA 
GTGAGATGGA 
TGAGGTGGCC 



CGTGGGTGTG 
TGATGGCAAA 
AATGAGCACC 
CTCCCCTGTC 
CTTCAACAAC 
GGCTCGTGGC 
AATCAGAAGG 
GATGAAGAAA 
GCAAAGCCTG 
CCAAAGTGTG 
6CTACAGGGT 
AGGGGCTAC6 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
CGGGGTCCAA 
GCTACAGGGT 
AGGGGCTAOG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCACAQ 
CATCGAGGAO 
G6TGACCGAC 
GGCTCAGCC6 
GGTGCCAGGC 
GAAGAGTGGG 
GCCCGTGGCA 
AGATGGCAAG 



CTAGGGGGCC 
TOCGACAATG 
GACGCTOCGO 
TGCAGAGATT 
TGCCCGTGGC 
TTTACAAGGA 
CACCGGAGGA 
GCTCCCTGTG 
ATCCCAGAGA 
GACGACCCCC 
6TGGACATCQ 
CATGGGAAOG 
CGCACGQTCA 
ATTGCCTACC 
TCTTCATCCT 
GGAGGGTTCC 
CAGAAAGGAA 
GCCAAGGAGC 
CCCAGAACCC 
CCAATCACTA 
GGGTCCAATC 
TACGGGGTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGCTCCAATC 
TACGGGCTGC 
GGGCTACGG6 
AAAGGGGCTA 
CTCAATCCCG 
TTCGACGGAG 
CTGTCCGTCT 
ACCCGGTTTG 
6CCCACCTGA 
CACTTTGGCC 
ATGGTGAGCC 



41 

I 

GGATGGGACT 
GGTACAATGG 
TCGCGGTCGA 
TCGGGQTCAC 
CCAATAATGC 
ACAGGCCPGT 
TCAGCGGAAG 
AGAGGGTGCC 
TTCTTCTGAG 
ATAACOGGTO 
TCTTTGCCGG 
ACATTGCCAA 
A6GCCAGTGA 
TCAGCAAATA 
GCAGAACCGA 
AT6GAAGCAC 
CAGCAGCTTT 
TTCGAACAGC 
CATGTTACTA 
OCCAACACXA 
GAAAACTA6C 
GCCGCCATGC 
CCACTGTGGT 
TGTCCA6ATG 
CTGCTAQAGA 
CA6GQAG6AG 
CAGCTCTCGG 
CTGGGGCAGT 
GGCCTCTTGA 
TTCTQGACAT 
ATGGAOACCA 
GCTCCTCTGA 
A6GT6QGCCT 
QAGGCGTCAG 
AQAATGOOCC 
CCAGTOCTGA 
TTCCTCACTC 
ATGCACGTCT 
CGGGGTCAC6 
CACTCAGCCT 
TCCTGGGGTC 
GCCTGAT6AC 
ACCAGCATG6 
TCTATGGCAA 
TCCGCTTCOG 
TCACCGCCGA 
GCAGCTCCTC 
TGACAGCTGG 
CAGGGCCAGG 
GGAAGGACGA 
CGGCCTCTGC 
AAGOGCCACA 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
CAGGGTCCAA 
GCTAC6GGGT 
AGGGGCTAOG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
CGGGGTCCAA 
GCGACGCCTT 
ACGGGATGCT 
TCCGGGGCAA 
GGGCCTTKaC 
GGATCATOGA 
TGGGGAAGGA 
GGAACGTGGC 



ggrgvsvgpi 
dfnrdgkvoi 
leipfnuiay 
dlilshgesm 
ggsgylceme 
dpaplecgqg 

NBD6TACVGT 
P8C 



51 

I 

GGGTGGGCCC 
ACCCAACCTG 
TGA608CAGC 
AGCCTGC6AC 
CTTCTCGGGC 
GCTGAAGCCT 
GGACTTTTCC 
GGTTCCCTGC 
ACCCAAATCA 
GGAAGACATC 
ACGCTCTOTG 
TTACGCCTAC 
CCTCTCCCGG 
TACAGAAG6C 
OGAGGGGGAA 
CA0CCAACT6 
GGTGGAGGAA 
TCTGCAGACT 
TTCTGTCTGC 
CCCTGTA6CC 
CCGGAGTGTC 
TGAGCCCGQC 
GCCAG6GGGC 
TGCACTCAGG 
GCTGTATGAC 
AA6GGACTCG 
GGGACTCGAG 
AGGAAGACCA 
A6CCGG6ACA 
GGCCAA6GCC 
T0AGCCCA8A 
GGAGCCTCTG 
GGGGCTTGCT 
CGTGGGCCCC 
TAACTTCCTT 
AOGTGGTTTA 
CCTGTGCCAC 
TCTTCAGGCT 
GTTCTATTCA 
CCAGGGTTCT 
TCTGATCCCC 
CCACAQCTAT 
GCGAGGTGTC 
CTQGAATGGC 
GGACATOGCC 
CTTTGACAAT 
AGCCAACCGC 
TGGGAGGAAC 
GGGTCAGGCC 
GGACTGG6CA 
TATTGCAGGG 
AGATACAAAG 
GOGCTAOGGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 
TACGGGGTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGAG 
TACCAGGAAA 
CGTCATCCGT 
GGAGCCTGAG 
GGACCTCATC 
TCAGGGCTTC 
CAGGGGAGCT 
OGGGGGCTCA 
TGAAGCCAGC 
CAGCGGGGAG 



240 
300 
360 
420 
480 
540 
600 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960
4020 
4080 
4140 
4200 
4260 
4320 







ATGAACTCAG TOCTOOAGAT CCTCTACCCC CGG6ATGAG6 
CCaCTGQAOT GTGOCCAAGG ATTCTCCCAG C3U3GAAAATG 
QAATGCATCC AGTTCCCATT OGTGTGCCCT CGAGACAAGC 
GGAAGCTACA GGTGC0C3GAC GAACAA6AAG TGCA6T0GG6 
GGGACAGCCr GCGT G OGTAC TGAGCTAGGC TCTA6GCATA 
CCCAAAAA6G AGCTGCAACT TTCCCAA06C ATCIGCAC3CC 
0C6G6TTGCC 6GCTGCT0CT CAAAA6AGCT CAGCTCCAGG 
CAGAAAGCTC CAGGTATTCC AOAAGCCCAA GTGTATGAAC 

Seq ID NO: 473 Protein sequence
Protein Accession #: FGENESH predicted



1 

I 

HACPG6LPAR 
SPYYAIiRDRQ 
PPTTFAGLLG 
GVATYTDKLP 
GNVGPDAIjIE 
QGDPEEADEB 
SKSELADKNL 
PHPRAPC^P 
ELGGPWSQAT 
QPGRVAKREI 
LAWNQMEKBE 



11 
I 

CSGNMGLGGP 
GNAIGVTACD 
LPPL56R0FS 
KFRKNRHEDI 
MDPEASDLSR 
HSGD6STSQL 

FGPFCyysvc 

KCK6RHAEPG 
QHLPARELYD 
GRET6AVGRP 



21 
I 



31 



AFIVELKYHL 
. FLTQGLASSA 
L6SERVNVGV 
8PKFSMPSFV 
GQGEGIjRIRR 
KGKCaiVAQSV 
RQPITTRKRG 
RX6LRAPITT 
NHYQSXQLQO 
AKOSNBYQER 
GRGTGGWTD 
KWLYTKKSG 
MNSVLSILYP 
GSYRCRINKK 
FGCRLUiKRA 



LGGRGVSVGP 
CRDFPKSLCH 
HRRTIiSLQGS 
DDPBQHGRGV 
RTVITADFDN 
GOFPGPGGQA 
PRTQAPQDTK 
Y0VQ5LPGKG 
RKRGYGVQSL 
PZTTRKRjQyR 
6LRAPITTRX 
FDGDGMLDLI 
AHLRIIDGGS 
ROEDTLQDPA 
CSRGYEPNBD 
QLQAAPSTLL 



IDGDGREBXY 
8SL6QASPDS 
LSDEVNVAR6 
GILALRDVAA 
CRIiGWKDGQF 
APSPAHPPFA 
LMAEALGAWP 
LGEPPILQRT 
LSHPLVFKFF 
FSLRKAREAE 
ZLSSSASDZF 
LAETGPSSSC 
QGAPPCLIiAR 
ALADFNRDGK 
OQELBIFRIN 
KVNT6PLMKK 
PHYHKKGLQG 
ATGSNHYQBK 
PQKGATGSUB 
VQSLPQKQAT 
R6Y6VQSLPQ 
LSHGESMAQP 
GYLCEMEPVA 
PLB0GQ6FSQ 
GTACVGTBLG 
QKAPGIPEAQ 



FLNTNNAFSG 
RQGERVPVPC 
VASLP/VGRSV 
EAGVSKYTEG 
XEBAAALVEE 
RQAPOKYPVA 
ALSTTWPGG 
DGDPGRRRDS 
SCLRPLEAGT 



CFWHARLLQA 
APCVL6SLIP 
VDIVYGNWNG 
IAYR8SSANR 
QKGRKDSDWA 
PXTTRKRGYG 
GLQGPITTRK 
YQEKGLRGPI 
GSNEYQHKGL 
RGAT6SNVIR 
LSVFRGHQGF 
HFGLGKDEAS 
QEN6HGMDTN 
8RHTHTWKPR 
VYBQDQE 



Seq ID NO: 474 DNA sequence 

Nucleic Acid Accession #: NM_003661.X

Coding sequence: 1..1152



1 
I 

ATGAGTGCAC 
CAAAACGTTC 
GCTGCTGGCA 
AAG6AAAAAG 
GGATTOSTGG 
GACAACCTTG 
TACAGAAACT 
AGAAGGCTCC 
AATGTGGTGT 
CTG0C!AO0CT 
ATCACAGCCG 
ACACAAGCCC 
GAGTTTTTGG 
ACAOGAGGCA 
GTACOGCATG 
OAACAGGTGG 
AOGGATGTGG 
TCAAAGCACT 
CAGGAGCTGG 
CAAGAACTGT 



11 
1 

TTTTCCTTGG 

caagtgggac 
cx:atggaccc 
tqagcacaca 

CTGCTGCZfSA 
CAAGACAAAT 
GGTTTCTGAA 
GTGCCCTTGC 
CTQQCTCTCT 
TCACAfiAlQGG 
CTTTGACOGG 
AAGCCCAC6A 
GTGAGAACAT 
TTGGGAAGGA 
CCTCAGCCTC 
AQA6GGTTAA 
CCCCTGTAAG 
TACATGAGGG 
AGGAGAAGCT 
6A 



21 

I 

TGTGGGAGTG 
AGATACT6GA 
AGAGAGCA6T 
QAATCTGCTA 
ACTGOCCAGG 
GATCATGAAA 
AGA6TTTCCT 
AGATGGGGTT 
CAGCATTTCC 
AGGGAGCCTT 
GATTACCAGC 
CCTGOTCATC 
ATCCAACTTT 
CATCG6TGCC 
AQ6CCCC0G0 
TGAACCCA6C 
CTTCTTTCTT 
GGCAAAGTCA 
AAACATTCTC 



31 

1 

A6GGCAGAGG 
GATCCTCAAA 
ATCTTTATTG 
CTCCTGCTGA 
AATQA6GCAS 
6ACAAAAACT 
OGGTTGAAAA 
CAGAAGGTCC 
TCTGGCATCC 
GTACTCTTGG 
AGTA0CATS6 
AAAAGCCTTG 
CTTTCCTTAG 
CTCAGACGAQ 
QTCAGTCnOC 
ATCCTGGAAA 
GTGCTGGATG 
GAGACAGCTG 
AACAATAATT 



ACACACTTCA 
GCCATTGCAT 
CCGTATGTGT 
GCTACG AGCC 
CAATGAOGTG 
CC6TCTGGTC 
CTGCTCCCAG 
AAGATCAGGA 



41 

I 

VltKYDRAQKR 
HSSSAQVPSG 
CRGGLRPTKB 
ACVDRKGSGR 
FSHTASPSXG 
QRBAGAAGVP 
PLVTQZiHTBG 
LRSWEESRQK 
PKVTQBCHLV 
VPGAALFGNP 
LQPPSGIiRGS 
FHNRGDGTFV 
PHCHRGLSMS 
TAYYIVLWSA 
PHRIjYLQMST 
LFRCSILAR6 
RG06KAGQSL 
VQSLPGKGAT 
RGYGLQSLPG 
TTRKRGYGLQ 
RGPITTRKRG 
REHGDPLIEE 
NNNWIiRWPR 
8VBVTHPD6K 
ECIQFPPVCP 
PKKELQLSQG 



41 

I 

AAGCTGGAGC 
GTAAGCCCCT 
AQQATOCCAT 
CTGATAAT6A 
ATGASCTOOG 
GGCACGATAA 
GTGAGCTT6A 
ACAAAGGCAC 
TGACCCT06T 
AACCTGG6AT 
ACTAOQGAAA 
ACAAATTQAA 
CTGGCAATAC 
CCAGAGCCAA 
CAATCTCAGC 
TGAGCAGA6G 
TAGTCTACCT 
AGGAGCTGAA 
ATAAGATTCT 



6GACCCA6CC 
GGACAOCAAT 
CAACACCTAT 
CAACGAGGAT 
6AAACCAAG6 
CTTTTTCCTO 
CACCCTTCTC 
ATAA 



51 

1 

LVNIAVDERS 
LHRNRFVLKP 
PEPPLLRPKS 
YSIYIANYAY 



RSRVRTALQT 
RLAGKXARSV 
GQAMSRCALR 
ATMPALGGIiE 
GNWVIiDMAKA 
PVWVGLGIA 
DAAA8AERRL 
PTRTGSRFYS 
IPESLMTHSY 
HGKVRFRDIA 
SSSLTAGGRN 
AKEPASAIAG 
GSNHYQEK6L 
RGATGSNHYH 
SLPGKGATGS 
YGIiQSIiPGKE 
LNFGDALEPE 
TRPGAPARGA 
MVSRMVAS6E 
RDKFVCVNTY 
ICTPVHSFFIi 



51 
I 

GAGGGTGCAA 
CGGTGACTGG 
TAAGTATTTC 
GGCCT6GAAC 
TAAAGCTCTG 
AGGCCA6CAG 
GGATAACATA 
CACX»TCGCC 
CGGCATGGGT 
GGAitSTTQGGA 
6AAGTQGTGG 
GGAGGTQAGG 
TTACCAACTC 
TCTTCAGTCA 
TGAAAG06GT 
A6TCAAGCTC 
CGTGTACGAA 
GAAGOTGGCr 
GCAGGCX3GAC 



Seq ID NO: 475 Protein sequence
Protein Accession #: NP_003652.1

1 11 21 31 41 51
| | | | | |
MSALPLGVGV RAEBA6ARVQ QHVPSGTDTG DFQSKPIiGDIf AAGTMDFESS IFIEDAIKYF 60
KEKVSTQNLL LLLTDNEAUN GFVAAAEZ^R NBADELRKAL DKLARQMIMK DiQWHDKGQQ 120
YRNWFLKBFP RLKSELEDNI RRLRAUa)6V QKVHKGTTIA NWSGSI18IS SGILTLVOIO 180
lAPFragGSL VLLEP(9IELG ITAALTGZTS STMDYGKKWH TQAQAHDLVI KSLDKLKEVR 240
BFLGENISNF LSIiAufJTYQL TRGXGKDIRA XiRRARANLQS VPHASA8RFR VTEPISAESG 300
EQVERVNBPS ZI1EM8ROVKL TDVAPVSPFL VLDWYLVYE SKHLHEGAKS ETAEELKXVA 360
QELEEKUnii NMNYKILQAD QEL



4380 
4440 
4500
4560 
4620 
4680 
4740 
60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 



60 
120 
180 
240 
300 
360 



Seq ID NO: 476 DNA sequence

Nucleic Acid Accession #: NM_014452.1

Coding sequence: 1..1968



1 11 21 31 41 51

| | | | | |

ATGGGGACCT CTCCGAGCAG CAQCACCGCC CTCGCCTCCT GCAOCCGCAT CGCCCGCCQA 60 

GCCACAGCCA CGATGATCGC GGGCTCCCTT CTCCTGCTTG GATTCCTTAG CACCACCACA 120 

GCrCASCCAG AACA6AAGGC CTGGAATCTC ATTGGCACAT ACCGCCATGT T6ACC6TGCC 180

ACC3GGCXAG6 TGCTAACCTG TGAC3UU3TGT CCAOCAGOAA CCTATGTCTC TGAGOVTTGT' 240 

ACCAACACAA GCCTGGGCGT CTGCAGCAGT TGCCCTGTGG 66ACCTTTAC CAGGCATGAG 300 

AATGGCATAG AGAAATGCCA TGACTGTAGT C3W3CCATGCC CATGGCCAAT GATTGAGAAA 360

TTACCTTGT6 CTGCCTTGAC TGACCGAGAA TGCACTTGCC CACCTGGCAT GTTCCAGTCT 420 

AAOGCTACCT GTGCCCCCCA TAGGGTGXGT GCTGTGGGTT GGGGTrGTGCG GAAOAAAGGG 480 

ACAGAOACTO AGGATGTGGG GTOTAAGCAO TOTGCTOGGG GTACCrTCTC AOATGIGCCT 540 

TCTAGTGTGA TCAAATGCAA A6CATACACA GACTGTCTGA GTCAGAACCT GOTGOTGATC 600 

AAGCOGGGGA CCAAGGAGAC AGACAACGTC TGTGGCACAC TCCCX3TCCTT CTCCAGCTCC 660 

ACCTCACCTT CCCCTGGCAC AGCCATCTTT CCAOGCCCTG AQCACATGGA AACCCAT6AA 720 

GTCCCTTCCT CCACTTATGT TCCCAAAGGC ATGAACTCAA CAGAATCCAA CTCITCTGCC 780 

TCTGTTAGAC CAAAGGTACT OAGTAGCATC CAGaMUSGGA CAGTCOCTGA CAACACAA3C 840 

TCAGCAAGGG GGAAGGAAGA CGTGAACAA6 ACXTCTCCCAA ACCTTCAGGT AGTCAACCAC 900 

CAGCAAGGCC CCCACCACAG ACAC31TCCTG AAGCTGCT6C CX5TCCATGGA GGCCACTGGG 960 

GGGGAGAAGT CCAGCACGCC CATCAAGGGC CCCAAGAGGG GACATCCTAG ACAGAACCTA 1020 

CACAAGCATT TTGACATCAA TGAGCATTTG OXTTGOATGA TTGTG Crm ' GCTGCTGCTG 1080 

GT6CTTGTGG TGATTGTGGT GTGCAGTATC OGGAAAAGCT OGAGGACTCT GAAAAAGGGO 1140 

CCC06GCAGG ATCCCA6T6C CATTGTGGAA AAGGCAGGGC TGAAGAAATC CATGACTCX:A 1200 

ACCCAGAACC GGGAGAAATG GATCTACTAC TGCAATGGCC ATGGTATCGA TATCCTGAAG 1260 

CTTGTAGCAG CCCAAGTGGG AAGCCAGTGG AAAQATATCT ATCAGTTTCT TTGCAATGCC 1320 

AGTGAGAGGG AGGTT6CTGC TTTCTCCAAT GGGTACACAG CCGACCACGA GCGGGCCTAC 1380 

GCAGCTCTOC AOGACTOGAC CATCOQGOGC CCCOAGGCCA GCCTGGCXICA GCTAATTAGC 1440 

GCCCTGC8CC AGCACGQGAG AAAOQATGTT GTGGAGAAGA TTOQTGGQCT GATGGAA5AC 1500 

ACCACCCAGC TGGAAACTGA CAAACTAGCT CTCCOGATGA GCCCCAGCCC GCTTAGCCCG 1560

AGCCCCATCC CCAGCCCCAA CGCGAAACTT GAGAATTCOG CTCTCCTGAC GGTGGAGCCT 1620 

TCCCCACAGG ACAAGAACAA GGGCTTCTTC GTGGATGAGT CGGAGCCCCT TCTCasCTGT 1680 

GACTCTACAT CCAGCGGCTC CTCCGCX3CTG AGCAG6AACG GTTGCTTrAT TACCAAAGAA 1740 

AAGAAGGACA CAQTGTTGCG GCAGGTAOGC CTGGACCCCT GTGACTTGCA GCCTATCTTT 1800 

GATGACATGC TCCACTTTCT AAATCCTGAG 6AGCTGCGGG TGATTGAAGA GATTOCCCAG 1860 

GCTGAGGACA AACTAGACCG GCTATTG6AA ATTATTG6AQ TCAAGAGCCA GGAAGGCASC 1920 
CAGACCCTCC TCGACTCTGT TTATAGCCAT CTTCCTGACX: TGCTGTAG 

Seq ID NO: 477 Protein sequence
Protein Accession #: NP_055267.1

1 11 21 31 41 51

| | | | | |

MGTSPSSSTA lASCSRIARR ATATMIAGSL LLLGFLSTTT AQPEQKASNL IGTYRHVDRA 60 

TGQVLTCDKC PAGTYVSEHC TNTSLRVCSS CPV6TFTRHE NGZBKCHDCS QPCFWPHIEK 120 

LPCAALTDRE CTCPPGNFQS NATCAPHTVC PVQW0VRXK8 TETHDVRCKQ CARGTFSDVP 180 

SSVMKCKAYT DCLSQMLWI KPGTKETDNV CX3TLPSFSSS TSPSPGTAXP PRPEBMETHB 240 

VPSSTYVPKG MHSTESNSSA SVRPKVIiSSI QEGTVPDKTS SARGKEDVHK TLPNXjQWNH 300 

QQGPHHRHIL KLLPSMEATG GEKS8TFIKG PKRGHPRQm. HKHFDINEHIi PWKIVLFIiLL 360 

VLWIWCSI RXSSRTLKKG PRQDPSAIVE KAGUCKSMTP TQNREKHXYY CHGHGIDILK 420

LVAAQVGSQH KDIYQFLCaiA SBREVAAFSN 6YTAQBBRAY AALQBWTZRG PBASLAQI.I8 480 

ALRQHSRNDV VEKIR6LKED TTQLETDKLA IiPMSPSFIiSP SPZPSFNAKL ENSALLTVEP 540 

SFQDKKIQSFF VDBSBFLIiRC DSTSS6SSAL 8RNGSFITKE XKDTVIiRQVR LDPaSLQPZF 600 
DDNLBFUIPB ELRVZEEIPQ AEDRLDRLFE IIGVKSQEAS QTLLDSVySH LPDIiL 

Seq ID NO: 478 DNA sequence
Nucleic Acid Accession #: XM_044533
Coding sequence: 238..2751

1 11 21 31 41 51

| | | | | |

OCTCTGCCCA AGGOQAGGCT G0GGGGC0G6 CX3CCG6CX3GG A6GACTGCGG TGCCCXIGOGG 60 

AGGGGCTQAG TTTGCCAGGG CCCACTTGAC CCTGTTTCCC ACCTCCCGCC CCCCAGGTCC 120 

GGAGGCGGGG GCCXXX:GGGG CGACTCGGGG GCGGACCGCG GGGCGGAGCT GCCQCCCGTG 180 

AGTCCGGCCG AGCCACCTGA GCCOGAGCCG CGGGACAC?CG TCGCTCCTGC TCTCCGAATG 240 

CTGC6CACCG CGAT6GGCCT GAGGAGCTGG CTCX3CCGCCC CATGGGGC3GC GCT0CC3QCCT 300 

CGGCCACCGC TGCTGCTGCT CCTGCTGCTG CTGCTCCTGC T6CAGCCQCC GCCTCCQAOC 360 

TGGGOQCTCA GCCCCCGGAT CAGCCTGCXTT CTGGGCTCTG AAGAGCX3GCC ATTCCTCAGA 420 

TTCGAAQCTQ AACACATCTC CAACTACACA GCCCTTCTGC TGAGCAGGGA TGGCAGGAOC 480 

CTGTAC6TGG GTGCTCGAGA GGCCCTCTTT GCACTCAGTA GCAACCTCAG CTTCCTGCCA 540 

GGOGGGGAGT ACCA6GAGCT GCTTTGGGGT GCAGAOGCAG AGAAGAAACA GCAGTGCAGC 600 

TTCAAGGGCA AGGACCCACA GCGCGACTGT CAAAACTACA TCAAGATCCT CCTGCCGCTC 660 

AGGGGCA6TC ACCTGTTCAC CT6TGGCACA GCAGCCTTCA GCCCCAT6TG TACCTACATC 720 

AACATG6A6A ACTTCACCCT GGCAAGa<^ GAQAAG6GGA ATGTCCTCCX GGAAQAXGGC 780 

AAGGGCCGTT GTCTCTTCGA CCCGAATTTC AAGTCCACTG CCCTGQTGGT TOATGGCGAG 840 

CTCTACACTG GAACAGTCAG CAGCTTCCAA GGGAATGACC CGGCCATCTC GCGGAGCCAA 900 

AGCCTTOSCC (XACCAAGAC OGAGAGCTCC CTCAACTGGC TGCAA6ACCC AGCTTTTGTG 960 

OCCTCAGCCT ACATTCCTGA GAGCCTGGGC AGCTTGCAA6 GCGATGATGA CAAGATCTAC 1020 

TTTTTCTTCA GCC»GACTGO CCAGQAATTT GAOTTCITTG A6AACACCAT TGTGTCCCGC 1080 

ATTGCCC6CA TCTGCAAGGG CGATGA6GGT GGAGAGCGGG T6CTACAGCA GCX3CTGGACC 1140 

TCCTTCCTCA TiQGCOCAGCT GCTGTGCTCA CGGCCCGACG ATGGCTTCCC CTTCAAOSTG 1200 

CTGCAGGATG TCTTCACGCT GAGCCCCAGC CCCCAGGACT GGCGTGACAC CCTTTTCTAT 1260 

GGGGTCTTCA CTTCCCAGTG GCACAGGGGA ACTACAGAAG GCTCTGCCGT CTGTGTCTTC 1320 

ACAATGAAGG ATGTGCAGAG AGTCTTCAGC GGCCTCTACA AGGAGGT6AA COGTGAGACA 1380 

CAGCAGTGGT ACACCGT6AC CCACCCGGTG CCCACACCCC GGCCT06AGC GTGCATCACC 1440 

AACAGTGCCC GGGAAAG6AA GATCAACTCA TCCCTGCAGC TCCCAGACCIO CGTGCXGAAC 1500 

TTCCTCAAGG ACCACTTCCT GATGGACGGG CAGGTCCGAA GCCGCATGCT GCTGCTGCAG 1560 

CCCCAGGCTC GCTACCAGCG OGTGGCTGTA CACCGCGTCC CTGGCCTGCA CCACACCTAC 1620 

8ATGTCCTCT TCCTGQGCAC TGGTaAGGGC OGGCTCCACA AGGCAGTGAG 0STG6GCCCC 1680 

0GGGT6CAGA TGATIGAGGA QCTGCAOATC TTCTCATC36G GACAGGCCGT GCAOAATCTG 1740 




CTCCTGGACA CCCACAGGGG QCTGCTGTAT GOGGCCTCAC ACTCGGGCGT AGTCCAGGTQ 
OCCaTGGCCA ACTGCAGCCr OTACAGGAGC TGTGGGGACT GCCTCCTCGC CCOGOACCCC 
TACTGTGCrr GGAGCGGCTC CAGCTGC3UW3 CACGTCAGCC TCTACCAGCC 'TCAGCTOGCC 
ACCAGGCCGT GGATCCAGGA CATCGAGGGA GCCAGC3GCCA AGGACCTTTG CAOCGCGTCT 
TCGGTTGTQT CCCCGTCTTT TGTACCAACA GGGGAGAAGC CAT6TGAGCA AGTCCAGTTC 
CAGCCCAACA CAGTGAACAC TTTGGCCTGC CCGCTCCTCT CCAACCTGGC GACCCXSACTC 
1!G6CTA06CA AC6GG6CCCC OGTCAATGCC TCGGCCTCCT GCCACGTGCT ACCCACTGGG 
GACCTGCT6C TGGTGGGCAC CCAACAGCTG GGGGAGTTCC AGTGCTGGTC ACTAGAGGAQ 
GGCTTCCAGC AGCTGffTAGC CAGCTACTGC CCAOAGGTGG TGGAG6A0GG GGTGGCAGAC 
CAAACAGATG AGG6TGGCAG TOTACCCGTC ATTATCAGOl CATOGOGTGT GAGIGCACCA 
GCTGGTGGCA AGGCCAGCTG GGGTGCAGAC AGGTCCTACT GGAAQGAGTT CCTOGTGATQ 
TOOkOGCTCT TTGTGCTGGC COTGCTGCTC CX»GTTTTAT TCTTGCTCTA CCXSGCACOSG 
AACA6CATGA AASTCr rC CT GAA6CAGG0G GAATGTGCCA GC6TGCACCC CAAGACCTGC 
CCTGTGGTGC TQCCCCCTGA GACTCGCCCA CTCAACGGCC TAGGGCCCCC TAGCACCCOG 
CTCGATCACC OAOGGTACCA GTCCCTOTCA GACftGCOCCX: CGGOGTCCOG AGTCTTCACT 
GAGTCAGAGA AQAQGCCACT CAGCATCCAA QACAGCTTCG TGGAfiGTATC CCCAGTGTGC 
CCCSCGGCCCC Q6QXCOC3CCT TGGCTCX3GAG ATCOOTOACT CTQTGGTGTG AGAGCTQACT 
TCCAOAGOAC GCTGCCCTGG CTTCAGGGGC TGTGAATGCT CGQAGA GGGT CAACTGQACC 
TCCCCTCCGC TCTGCTCTTC GTGGAACACX3 ACCOTGGTGC CCXSGCCCTTG GGA6CCTTG0 
GQCCAGCTGG CCTGCTGCTC TCCAGTCAAG TAaCOAAGCT CCTAGCACCC AGACAOCCAA 
ACAGCCGTGG CCCCAGAGGT CCTGGCCAAA TATG0G060C TCCCTAOOTT GGTGQAACAO 
TQCTCCTTAT GTAAACTQAG CCGTTT&TTT AAAAAACAAT TCCAAATGTG AAACTAGAAT 
GAGAGGGAAG AGATAGCATG GCATGCAGCA CACACGGCTG CTCCAGTTCA TGOCCTCCCA 
6GGGTGCTGQ GOATGCATCC AAAGTGGTTG TCTGAGACAQ AGTTGGAAAC CCTCACCAAC 
TGGCCTCTTC ACCTTCCACA TTATCCCSGCT GCCACC3Q8CT GCCCTGTCTC ACTGCA6ATT 
CAQGACCAOC TTGGGCTGCG T6C3GTTCTGC CTTGCCAGTC AGCCGAGGAT GTAGTTGTTG 
CTGCCGTCOT COaCCACCT CAGGGACCAG AQGGCTAG6T TGGCACTGCG GCCCTCACCA 
GGTCCTGGGC TOGGACCCAA CTCCTGOACC TTTCCAGCCT GTATC AGGC T GTGSCCACAC 
GAGAGGACAG CGCGAGCTCA GGAGAGATTT CQTGACAATG TACGCCTTTC CCTCAGAATT 
CAGGGAAGAG ACTGTCGCCT OCCTTCCTCC 6TTQTTG0GT GAGAACCCOT OTGCCCCTTC 
CCACXATATC CACCCTCGCT CCATCTTTQA ACTCAAACAC GAGGAACTAA CT6CACC3CT0 

GTcxrrcTCcx: cagtccccag ttcaccctcc atccctcacc ttcctccact ctaagggaxa 

TCAACACTGC CCAGCACAGG GGCCCTGAAT TTATGTGGTT TTTATACATT TTTTAATAAG 
ATGCACTTTA TGTCATTTTT TAATAAAGTC TGAASAATTA CTGrTT 

Seq ID NO: 479 Protein sequence
Protein Accession #: XP_044533.3

1 11 21 31 41 51

| | | | | |

MZiRTAM6I«BS WLAAFWGALP PRPPLLLLLL LLLLLQPPPP TWAL8FRISL PLGSBBRPFL 

RFBAEHISNY TALHiSRDGR TLYVGAREAL FALSSNLSPL PGGEYQBLLW GADABKKQQC 120 

SPK6KDPQRD CQNYIKILLP LSQSHIiPTOG TAAFSPMCTY INMEWFTLAR DEKGNVLLED 180 

GKGRCPPDPN FKSTALWDG ELYTOTVSSF QGNDPAISRS QSIAPTKTES SLNWLQDPAF 240 

VASAYZPBSXi GSLQODDDKX YFFFSBTGQB FBFFSHTIVS RZARZCRQDE GGERVIX3QRW 300 

TSFIiKAQUiC SRPDDQFPFM VLQDVFTLSP SPQDWROTLP YGVFTSQWHR GTTEGSAVCV 360 

FTMKDVQRVF SGLYKEVNRE TQQWVTVTHP VPTPRPGACI TNSARERKIM SSLQLPDRVI, 420 

NPLKDHFIi© GQVRSRMLLL QPQARYQRVA VHRVPGLHHT YDVLFLGTGD GRLHKAVSVG 480 

PRVHIIEELQ IFSSGQPVQN LUiDTHRGLI* YAASHSGWQ VPMANCSLYR SCGDCLLARD 540 

PYCAWSOSSC KBV6LYQFQL ATRPHIQDIB GASAKDIiCSA SSWSPSPVP TGEKPCEQVQ 600 

FQPOTVMTLA CPLLSNLATR LWLRNGAPVN ASASCHVLPT GDLLLVGTQQ LGEFQCWSLE 660 

EGFQQLVASY CPBWE33GVA DQTDEGGSVP VIISTSRVSA PAGGKASWGA DRSYWKEFLV 720 

MCTLFVLAVL LPVLFLLYRH RNSMKVPLKQ GECASVHPKT CPWLPPBTR PLNGLGPPST 780 
PLDHRGYQSIi SDSPPGSRVF TBSBKRPLSI QDSFVEVSPV CPRPRVRU^S BIRDSW 

Seq ID NO: 480 DNA sequence

Nucleic Acid Accession #: NM_004217.1

Coding sequence: 58..1092

1 11 21 31 41 51 

6GCCGGGAGA GTA6CAGTGC CTTGQACCCC AGCTCTCCTC CCCCTTTCTC TCTAAGGATG 60 

GCCCAGAAGG AGAACTCCTA CCCCTGGCCC TAOGGCOGAC AGACQQCTCC ATCTGOCCTG 120 

AGCACCCTOC CCCAQOGAGT CCTCCX3GAAA GAGCCTGTCA CCCCATCTGC ACTTGTCCTC 180 

ATGAGOXSCT CCAATGTCCA GCCCACAGCT GCCCCTGGCC AGAAGOTGAT GGAGAATAGC 240 

AGTGG6ACAC COOACATCTT AACGCGGCAC TTCACAATTQ ATGACTTTGA GATTGGGCGT 300 

CCTCTCGGCA AAGGCAAGTT TGGAAACGTG TACTTGGCTC GGGAGAAGAA AAGCCATTTC 360 

ATCQTOGCGC TCAAGGTCCT CTTCAAGTCC CAGATAGAGA AGGAGGGOST GGAGCATCAG 420 

CTGOGCAfiAG AQATCGAAAT CCAGGCCCAC CTGCACCATC CCAACATCCT GCGTCTCTAC 480 

AACTATTTTT ATGACCGGAG GAGGATCTAC TTGATPCTAG AGTATGCCCC CCGOGGQQAG 540 

CTCTACAAGG AGCTGCAGAA GAGCT6CACA TTTGAOGAGC AGCGAACAGC CACGATCATG 600 

GAGGAGTTGG CAGATGCTCT AATGTACTGC CATOGGAAGA AGGTGATTCA CAGAGA CATA 660 

AAGCCA6AAA ATCTGCTCTT AGGGCTCAAG GGAOAGCTGA AGATTGCTGA CTT OSGC TGG 720 

TCTGTGCATG CGCCCTCCCT GAGGAGGAAG ACAATGTGTG GCACCCTGGA CTACCT6CCC 780 

CCAGAGATQA TTGAGGGGOG CATGCACAAT GAGAAGGTGG ATCTGTGGT6 CATTGGAGTG 840 

CTTTGCTATQ AQCTGCTGGT GGGGAACCCA CCCTTTQAOA GTGCATCACA CAACGAGACC 900 

TATCGCCGCA TCGTCAAGGT GGACCTAAAG TTCCCOQCTT CTGTGCCCAC GGGAGCCCAG 960 

GACCTCATCT CCAAACTGCT CAGGCATAAC C3CCTCGGAAC GGCTGCCCCT GG CCCAGOT C 1020 

TCAGCCCACC CTTGGGTCOG GGCCAACTCT 0GGAGGGT6C TGCCTCCCTC TGCCCTTCAA 1080

TCTGTCGCCT GATGGTCCCT GTCATTCACT 06GGTG0GTG TGTTTGTATO TCTQ TOTAT Q 1140 

TATAGGGGAA AGAAGGGATC CCTAACT6TT CCCTTATCTO TTTTCTACCT CCTCCTTTOT 1200 
TTAATAAAGG CTGAAGCTTT TTGT 

Seq ID NO: 481 Protein sequence
Protein Accession #: NP_004208

1 11 21 31 41 51
| | | | | |
MAQKENSYFH FYGRQIAFSO I.8TLPQRVI1R KEFVTPSMiV LMSRSNVQPT AAPGQKVHBN 60
SSGTFDILTR HFTIDDFEIG RPLGKGKFGN VYLAREXKSR FZVALKVLFK SQIBKEGVEH 120
QLRREIEIQA HLHHPNILRL yHYFYDRRKI YLILEYAPR6 ELYKELQKSC TFDBQRTATI 180
MEELADALMY CKGKKVIHRO IKPENLLUGh KGELKIADF6 WSVHAPSLRR KTMCGTLDYL 240
PPEMIEGRMH NEKVDLHCIG VLCYELLVGH PPFBSASHNE TYRRIVKVDL KFPASVPTGA 300
QDLISKLLRH NPSERLPLAQ VSAHPWVRAK SRRVIiPPSAL QSVA






Seq ID NO: 482 DNA sequence
Nucleic Acid Accession #: MQ0SS663
Coding sequence: 38..1423

1 11 21 31 41 51
| | | | | |
AQAAC6GCTT CCGGGGGGAG CTGTGCAGCT CCITATCATG GGGACAATTC ATCTCTTTC6 60
aaaacx:acaa AGATCCTTTT TTGGCAAGTT 6TTACGGGAA TTTAGACTTG TAGCAGCTGA 120
ocqaaggtcc TGGAAGATAC TGCTCTTTGG TGTAAXAAAC TTGATATGTA CTGGCTTCCT 180
GCTTTATGrGG TGCAGTTCTA CTAATAGTAT AGCTTTAACT GGCTATACTT ACCTGACCAT 240
TTTTGATCTT TTTAGTTTAA TGACAT6TTT AATAAGTTAC TGGGTAACAT TGAGGAAACC 300
TAGCCCTGTC TATTCATTTG GGTTTGAAAO ATTAGAAGTC CTGGCTGTAT TTGCCTCCAC 360
AGTCTTGGCA CAGTTGGGAG CTCTCTTTAT ATTAAAAGAA AGTGCAGAAC GCTTTTTGQA 420
ACAGCXICGAG ATACACACGG GAAGATTATT AGTTGGTACT TTTGTGGCTC TTTGTTTCAA 480
CCTGTTCACG AT6CTTTCTA TTCGGAATAA ACCTTTTGCT TAT6TCTCA6 AAGCTGCTAO 540
TACGAGCTGG CTTCAAGAGC ATGTTGCAGA TCTTAGTCGA AGCTTGTGTG GAATTATTCC 600
GGGACTTAGC AGTATCTTCC TTCCCCGAAT GAATCCATTT GTTTTQATTG ATCTTGCTGO 660
AGCATTT6CT CTTTGTATTA CATATATGCT CATTGAAATT AATAATTATT TTGCCGTAGA 720
CACTGCCTCT GCTATAGCTA TTGCCTTGAT GACATTTGGC ACTATGTATC CCATGAGTGT 780
GTACAGTGGG AAAGTCTTAC TCCAGACAAC ACCACCCCAT OTTATTGGTC AGTTGGACAA 840
ACTCATCAGA GAGGTATCTA 0CTTAGATG6 AGTTTTAGAA GTCC6AAATG AACATTTTTG 900
GACCCTAGGT TTTGGCTCAT TGOCTQGATC J\GTaCATOTA AQAATTOGAC GAGA7GCCAA 960
TGAACAAAT6 GTTCTTGCTC atgtgagcaa CAGGCTGTAC ACTCTAGTGT CTACTCTAAC 1020
TGTTCAAATT TTCAAGQATG act6gattag GCCTGCCTTA TTGTCTGGGC CTGTTGCAGC 1080
CAATGTCCTA AACTTTTCAG atcatcaogt AATCCCftATG CCTCTTTTAA AGGGTACTGA 1140
TGATTTGAAC CCAOTTACAT caactcx:agc TAAACCTAGT AGTCCACCTC CAGAATTTTC 1200
ATTTAACACT CCTGGGAAAA ATGTQAACOC A6TTATTCTT CTAAACACAC AAACAAGGCC 1260
TTATGGTTTT GGTCTCAATC ATGGACACAC ACCTTACAGC A6CATCCTTA ATCAA66ACT 1320
TGGAGTTCCA GGAATTGGAO CAACTCAAGG ATTGAGGACT GGTTTTACAA ATATACCAAG 1380
TAGATATGGA ACTAATAATA GAATTGGACA ACCAAGACCA TGATAGACTC TAACTTATTT 1440
TTATAAGGAA TATTGACTCC TTGGCTTCCA ATTTATTTAG TAATCCAACT TTGCATT6AC 1500
TGTTTAATCA TTTACTCTAA ATGTTAGATA ATAGTAGTCT TGTTCACATT TCAT6AAA0C 1560
TATGAAACTA TATTTTTGTA AAATGTATTT GTGACAGTGA AATCCTOGTA AATGTTAAAG 1620
GCTTTAAATA GGCTTCCTTT AGAAAATGTG TTTCTTTAAA TTTGQATTTT GGTATCTTTG 1680
GTTTTGTAGT TGACTGCAGT GTGATGTGAC CTTACCTTTA TAAGA6CCAC TTGATGGAGT 1740
AGATCTGTCA CATTACTAAG ATAOSATATT TCTTTTTTTT TCOGAGAOGG AGTCTTGCTC 1800
TGCCACTBTG C0CG6CCAAT ACATTATTAT TAACTTAAGG CTGTACTTTA TTAAQGCTTC 1860
CTTAGTTTTT Q TTT T 'GT' rrr QTTTTTTGAQ ATGGAGTCTC ACTCTGTCGC CCAGGCTGGA 1920
ATGCAGTGGC AT6ATCTCAG CTCACTGCAA CCTCTGCCTC CTGAGTTCAA ATGATTCTCC 1980
TGCCTCAGCC TCCCGAGTAG CTGGQATTAC AGGCACCTGC CACCACGCXX: A6CTAATTTT 2040
TGTATTTTTA OTAAAGAOQG GG6ATTTCAC CATGTTG6CC AGGCTGGTCT T6AACTCCTG 2100
ACCTCATGAT CCACCCACCT TAGCXTCGCA AAGTGCTGGO ATTAGGTOTG AGGCACG6CA 2160
CCTGGCCGAT ATTTTCTTTA ATGAAATTTA TAAATATQCT TCTTOAATAA TACACATTTT 2220
GGGAAAGQGA AAAATGTCTG TTCAAAAAGT AAAGOTCTCT TTTATAGCTT TTCCAAACTT 2280
AATTGCTAAA TTTTTCTTTG AGGTTCTCCT GAATTATGTC TTACAAACTA AAAGCAAAAA 2340
TTTTTAGCAG AAATTTTGGA ATACATTCTA TCTAGCACAA TTTGAATTTT TAATTATCAA 2400
CSATTTTTGTT AAAGTTTCTC TCCTTTAAAA ATTTTAGTAC ATTTGTAAAT


Seq ID NO: 483 Protein sequence 
Protein Accession #: BAB70980.1



1 
I 

MGTIHLFRKP 
TAYTYLTIPD 
ESAERFLEQP 
RSLOGIIPGL 
GTMYPMSVYS 
VRZRRDANBQ 
MFIiLKGTDDL 
SSMLNQGLGV 



11 
I 

QRSFFGKLLR 
liFSLHTCLIS 
EIHTGRLLVG 
SSIFtiPRMNP 
GKVLLQTTPP 
HVLAHVTNRL 
NFVTSTPAKP 
PGI6ATQGLR 



21 
I 

EFRLVAADRR 
YHVTIiRKPSP 
TFVALCFNLF 
FVLIDLAGAF 
HVIGQLDKLI 
YTLVSTLTVQ 



TGFTiriPSRy 



31 

1 

SWKILLFGVI 
VYSFOFERLB 
TMI*SIRHKPF 
ALCITYMLIE 
REVSTLDGVL 
IFKDDWIRPA 
TPGKNVNFVI 
GIUmRIGQPR 



41 

I 

NLICTGFUiM 
VLAVFASTVL 
AYVSEAASTS 
INNYPAVDTA 
BVRNEHPWTL 
LLSGPVAANV 
LLNTQTRPYQ 
P 



51 

1 

WCSSTNSIAL 
AQLGALFII.R 
WLQBEVADLS 
SAIAIAIiMTP 
GFGSLAGSVH 
LNFSDHKVIP 
FQUIHGHTPy 



Seq ID NO: 484 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 1..900



ATGCOGCCGC 



GCCGTGGGCA 

CGGCCCACTG 
GGCTGCGGCG 
GGACCCCGGG 
CTTCCTAACT 
CCGGTGCGCA 
CTTTGCTACC 
TTTCAAAACA 
GTGCTGCTGG 




GACCCCTCCC 



06O80GCTAC 
GCGGCCGCGT 
AGGGCGCAGA 
CCAGQACGCT 
GGATGGAGCT 
ACTTOGTTCC 
GCCCA6CTCC 
CCAGGGGCCT 
AATTCAGCT6 



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 




GACCAGG66G GCCGGGA6GQ CCCCOTGCCC CAACCCCaGG CTCAGGGTCT GGCCGAQAAG 720 

ATOCGftGCCT GCTGCTACCT TGAGTGCTCA GCCTTGACGC AGAAGAACTT GAAGQAAGTA 780 

TTTGACTOGG CTATTCTCAG TGCCATTGAG CACAAAGCCC GGCTQGAGAA QAAACTGAAT 840 
GCC3UVAG6TG TGCGCACCCT CTCCCGCTGC OGCTGGAAOA AGTTCTTCTG CTTCOTTTGA 

Seq ID NO: 485 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

KPPRELSEAE PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCaiaYPARy 60 

RPTALDTPSG TYVQSPVRPR GCGGAVHRGA GAGVSAGGRR GPRGGDWSRP RGGAGAAQDA 120 

LPNSQSPRPA PAVQVLVDGA PVRIELWDTA GQEDFDRLRS LCYPDTDVFL ACFSWQPSS 180 

FQNITEKHLP EZRTHNPQAP VLLVGTQADL RDDVNVLIQIi DQGGREGPVP Q PQAQGL ABK 240 
ZRACCYLECS ALTORNItXEV FDSAIttSAXE KKARI£KKUI AXGVRTLSRC RWXKFFCFV 

Seq ID NO: 486 DNA sequence 

Nucleic Acid Accession #: XM_063832.2

Coding sequence: 1..711

1 11 21 31 41 51

| | | | | |

ATGCCGCCX3C GQGAGCTGAG 0GAGGCCX5AG CX3GCCCCCGC TCCGGGCCCC GACCCCTCCC 60 

CCGCGGCGGC GTAQCGCX3CC CCCAGAGCTG GGCATCAAGT GOGTGCTGGT GGGCGAOGGC 120 

GCXSQTQGGCA AGAGCA60CT CSaOjTCAGC TACACCTGCA ATGGGTACCC OQOSCGCTAC 180 

OGOCCCACTO OOCTGGACAC CTTCTCTOTG CAAGTCCTGG TGGAT GGAG C TOOGOTOOQC 240 

ATTQAGCrCT GGGACaCAGC GGGACRGGAG GATTTTGACC GACTTCOTTC CCTT TGCTAC 300 

COQGATACOG ATGTCTTCCT G60GTGCTTC AGCGTGGTGC AGCCCAGCTC CTTTCAAAAC 360 

ATCACAGAGA AATGGCTGCC 0GAGATCC5GC ACX3CACAACC CCCAGGOQCC TGTGCTGCTG 420 

GTGGGCACOC A6GC06ACCT GAG6GA0GAT GTCAAOSTAC TAATTCAGCT GGA0CA66GG 480 

GGCCGGGAGG GCCCCGTGCC CCAACCCCAG GCTC3W3GGTC TGGCCGAGAA G ATOO GAGCC 540 

TGCTGCTACC TTGAGTGCTC AGCCTTGACG CAGAAQAACT TOAAGGAAGT ATTTGACTGG 600 

GCTATTCTCA GTX3CCATTGA GCACAAAGCC CGGCTGQAGA AGAAACTGAA TGOCAAAGGT 660 
GTGCGCACCC TCTCCCGCTG CCGCTGGAAO AAGTTCTTCT GCTT06TTTG A 

Seq ID NO: 487 Protein sequence
Protein Accession #: XP_063832.1

1 11 21 31 41 51

| | | | | |

HPPRBLSEAB PPPIiRAPTPP PRRRSAPPEIi GIKCVLVGDG AVOKSSLZVS YTCNGyPARY 60 

RPTALDTPSV QVLVDGAPVR lELWDTAGQE DPDRLRSLCY PDTDVPLACP SWQPSSFCBi 120 

ITEKWLPEIR THNPQAPVLL VGTQADLRDD VNVLIQLDQG GRBGPVPQPQ AQGLAEKIRA 180 
CCHiECSALT QKNLKEVPDS AIIiSAIEHKA RLEKKUJAKQ VRTIiSRCaiWK KFFCPV 

Seq ID NO: 488 DNA sequence

Nucleic Acid Accession #: NM_014398.1

Coding sequence: 64..1314

1 11 21 31 41 51

| | | | | |

GGCACXrGATT CGGGGCCTGC CCGQACTTOG CCGCACGCTG CAGAACCTCX3 CCCA GOGCCC 60 

ACCATGCCCC GGCAGCTCAG CGCGGCGGOC GCGCTCTTC3G C3GTCCCTGGC OGTAATTTTG 120 

CACGATGGCA GTCAAATGAG AGCAAAAGCA TTTCCAGAAA CCAGAGATTA TTCTCAACCT 180 

ACTOCAGCAG CAACAGTACA GGACATAAAA AAACCTGTCC AGCAACCAGC TAAGCAAGCA 240 

CCTCACCAAA CTTTAGCAGC AAGATTCATG GATGGTCATA TCACCTTTCA AACAGCGGCC 300 

ACASTAAAAA TTCCAACAAC TACCCCAGCA ACTACAAAAA ACACTGCAAC CACCAGCCCA 360 

ATTACCTACA CCCTGGTCAC AACCCAGGCC ACACCCAACA ACTCACACAC AGCICCTCCA 420 

GTTACTGAAG TTACAGTCGG CCCTAGCTTA GCCCCTTATT CACTGCCACC C»CCATCACC 480 

CCACCAGCTC ATACAQCTGG AACCAGTTCA TCAACOGTCA GCCACACAAC TGGGAACACC 540 

ACTCAACCCA GTAACCAGAC CACCCTTCCA GCAACTTTAT CXSATAGCACT GCACAAAAGC 600 

ACAACCGGTC AGAAGCCTQA TCAACCCACC CATGCCCCAG GAACAACGGC AGCTGCCCAC 660 

AATACCACCC GCACAGCTGC ACCTGCCTCC ACGGTTCXrTG GGCOCACCCT TGCACCTCAG 720 

CCATOGTCAG TCAAGACTGG AATTTATCAG GTTCTAAAOG GAAGCA6ACT CTGTATAAAA 780 

GCAGAGATGG GGATACAGCT GATTGTTCAA GACAAGGAGT CQGTTTTTTC ACCTCGGAGA 840 

TACTTCAACA TCGACCCCAA CGCAACGCAA GCCTCTGGGA ACTGTGGCAC CCGAAAATCC 900 

AACCTTCTGT TGAATTTTCA GGGCGGATTT GT6AATCTCA CATTTACCAA GGATGAAGAA 960 

TCATATTATA TCAGTGAAGT GGGAGCCTAT TT6A0C0TCT CAQATCOUSA QACAGnTAC 1020 

CAAGGAATCA AACATGCGGT GGTGATGTTC CAGACA6CA0 TCGGGCATTC CTTCAAGTGC 1080 

GTGAGTGAAC AGAGCCTCCA GTTGTCAGCC CACCTGCAGG TGAAAACAAC 06ATGTCCAA 1140 

CTTCAAGCCT TTOATTTTGA AGATOACCAC TTTGGAAATG TGGATGAGTG CTCGTCTGAC 1200 

TACACAATT6 TGCTTCCTGT GATTGGGGCC ATOGTG6TT6 GTCTCTGCCT TATGGG TATG 1260 

aGTGTCTATA AAATCOQCCT AAGGTOTCaUl TCATCTG6AT ACCAGAGAAT CTAATTGTTG 1320 

CCOGGQ06GA ATGAAAATAA TGGAATTTAG AGAACTCTTT CATCCCTTCC AGGATGGATG 1380 

TTGGGAAATT CCCTCAGAGT GTGGGTCCTT CAAACAATGT AAACCACCAT CTTCTATTCA 1440 

AATGAAQTGA GTCATGTGTG ATTTAAGTTC AGGCAGCACA TCAATTTCTA AATACTTTTT 1500 

GTTTATTTTA TGAAAGATAT AGTGAGCTGT TTATTTTCTA GTTTCCTTTA GAATATTTTA 1560 

GCCACTCAAA 6TCAACATTT QAGATATOTT 6AATTAACAT AAT ATATOTA AAGTAGAATA 1620 

AGCCTTCAAA TTATAAACCA AGGGTCAATT GTAACTAATA CTACTGTGTG TGCATT6AAG 1680 

ATTTTATTTT ACCCTTGATC TTAACAAAOC CTTT6CTTTG TTATCAAATG GACTTTCAGT 1740 

GCTTTTACTA TCTGTGTTTT ATGGTTTCAT QTAACATACA TATrCCTGGT GTAGCACTTA 1800 

ACTCCTTTTC CACTTTAAAT TTOTTTTTGT TTTTTGA6AC GGAGTTTCAC TCTTGTCACC 1860 

CAGGCTGGAG TACAGTG6CA OGATCTCGGC TTATGGCAAC CTCOGCCTCC CGGGTTCAAG 1920 

TGATTCTCCT OCTTCAGCTT CCOGAGTAGC TGGGATTACA GGCACACACT ACCACX3CCTG 1980 

GCTAATTTTT GTATTTTTAT TATAGAOGQG TTTCACCATG TT6QCCAGAC TGGTCTTGAA 2040 

CTCTTGACCT CAGGTQATCC ACCCACCTCA GCCTCCCAAA GTGCTGGGAT TACAGGCATG 2100 

AGCX31TTGCG CCCGGCCTTA AATGTTTTTT TTAATCATCA AAAAQAACAA CATATCTCJkG 2160 

6TTGTCTAAG TGTTTTTATG 
CTTGATGACT CCTGCTCCAG 
CTAAACAATA AGCAAGAOAC 
TAGGCTAAGC ACTTTATCTA 
ACTGAGACrr AAGGOAACTG 
QAGCTTOAAT TCATGTTGGT 
CCTACAAGAA CAATGACACC 
TCACCTTACA GGGAAATGGG 
AGCTTTGCAG ATAACAAAAT 
TGAGGGGCTT TGTAAAACAT 
GTAAAGATGA AGGCATCAAA 
ACTTCGCTAA CCAACTGTTC 
TTCTGCACTT CATATCCATA 
GAATTTTATT TCTGCTGTTT 
AGAAAAGTCC ACATAACCCT 
CCATGTT6AC TTTCCTCATG 
TAGTCTTAAT AAAACATTGA 



TAAAACCAAC 
AATTGCTAGA 
AATAATAATG 
TATCTCATTT 
AATCACTTAA 
CTGACATCAA 
ACACTCTGCC 
TTTATCCAGG 
AGCCTAtCCT 
TAGTCA6TTG 
TAAACTCAAA 
TTTCTTGAGT 
TTTCCTATTQ 



A6AATTCTTA 
TGTTTCCTTA 
ATTGTAGTAA 



AAAAAGAACA 
CTAAOAATTA 
GCCCTTAATT 
CATTCTCACA 
ATGTCACCTG 
GGTCTTTGGT 
TGAAGGCTCA 
ATCATGAGAC 
TAATAAATCC 
CTCATTTTTA 
GTATTTTTAA 
OTATAGCCCC 
TTCACTTTAT 
TAAAGAAAGG 
GTCAAGGAAT 
7GACTCAGTA 
AGGTTTTTQC 



AATCA6CTTA 
6GTG6CTACA 
ATTAAC3UUU3 
ACTTATAA6T 
GCTAACTGAT 
CTTCTCCCTA 
CACCTCATAC 
ATTAGGGTAG 
TCCACTCTCT 
TGGGATTGCT 
ATTTTTTTGA 
ATCTTGTGGT 
TCTGTAGAGC 
AACTAAGTCA 
AATTCAAGTC 
AGTTG6CAA6 
AATAAAAACT 



TATTTTTTAT 
GATGOTAOAA 
TGOCAGASrC 
GAATGAGTAA 
GGCAGAGCCA 
CACCAAGTTA 
CAGCATACGC 
ATGAAA6GAG 
GGAAGGAGAC 
TAGCTGGGCT 
TAATAGAGAA 
AACTTGCTGC 
AGCCTGCCAA 
GGATGTTAAC 
AGCCTA6AGA 
GT0CT6ACTT 
TACTTTQQ 



Seq ID NO: 489 Protein sequence
Protein Accession #: NP_055213.1

1          11         21         31         41         51
|          |          |          |          |          |
MPRQLSAAAA LFASLAVILH DGSQMRAKAF PETRDYSQPT AAATVQOIKK PVQQPAKOAP
HQTLAARFMD CTITPQTAAT VKIPTTTPAT TXNTATTSPI TYTLVTTQAT PNNSHTAPPV
TEVTVGPSItA PYSIiPPTITP PAHTAGTSSS TVSHTTGNTT QPSNQTTLPA TLSIAUIKST
TGQKPDQPTH APGTTAAAHN TTRTAAPAST VPGPTLAPQP SSVKTGIYQV LNGSRIiCIKA
EMGIQLIVQD KESVPSPRRY FNIDPNATQA SGNCXrrRKSN LLLHFQGGFV NLTFTKDEES
YYISEVGAYL TVSDPETVYQ GIKHAWMFQ TAVGHSPKCV SEQSXiQLSAH LQVKTTDVQL
QAFDFEDDHF GNVDECSSDY TIVLPVIGAl WGLCLNQM6 VYKIRLROOS SQYQRZ



Seq ID NO: 490 DNA sequence 

Nucleic Acid Accession #: NM_005409.3

Coding sequence: 94..376



1 
i 

TTCCTTTCAr 
CaACAGCACC 
TTGGCTGTQA 
TGTCTTTGCA 
ATAATGTACC 
AAAGGftCAAC 
GAAAGAAAGA 
AACCTAGAAC 
AGACTTTTCT 
GGGTGAAAGG 
CT6CCCAAAG 
GGTTACCATC 
GCATTTCTAG 
6AGAACATTT 
CTGT6GTTAC 
CATCTATGTG 
CCAAATATCA 
TTTATAACCA 
TGGGATACTG 
GATGTTTTTC 
TGTACTTTTT 
TACAAAATGT 
AATCACTTTT 
TTGTTCATGC 
GTCATTTTTT 



11 

1 

GTTCAGCATT 
AGCAGCAACA 
TATTGTGTGC 
TAGGCCCTGG 
CAAGTAACAA 
GATGCCTAAA 
ATTTTTAAAA 
AAGTTTAACT 
ATGGTTTTQT 
ACCAAAAACA 
6AGTCCAGCA 
GGAGTTTACA 
GCTAGAGAAC 
CTGTCTCTAG 
AGTGGAGACA 
TG8TAAAGCA 
TGTAGCACAT 
ATTCATTAAA 
6CAACAGTGC 
AACTTTTATT 
QTTTTQATCC 
TTTTGTCTAC 
ACTTTTTGTA 
CTATATACTG 
TCTCTAATAA 



21 
I 

TCTACTCCTT 
GCAAAAAACA 
TACAGTTGTT 
GGTAAAAGCA 
CTGTGACAAA 
TCCCAAAT06 
ATATCAAAAC 
GTGACTACTG 
GACTTTCAAC 
GAAATACAGT 
ATTAAATOGA 
AAtSTQCTTTC 
CTTCTAGATT 
AAGTTATCTG 
TTGACATTAT 
TTCCTCAAAC 
CAATATGTAG 
TGTAATTCAT 
ACATATTTCA 
CATTGAGATG 
6TTT6TATAA 
CAAAOAAAAA 
ATTCTGTCTC 
TAAAATTTA6 
ACTACCACAA 



31 

I 

CCAAGAAGAG 
AACATGAGTG 
CAAGGCTTCC 
GTGAAAGTGG 
ATAOAAGTGA 
AAOCAAGCAA 
ATATGAA6TC 
AAATGACAAG 
TTTTGTACA6 
CTTCCTGAAT 
TTTCTAGGAA 
ACGTTCTTAC 
T6ATGCTTAC 
TCTGTATTGA 
TACTGGAGTC 
ATTTTTTCAT 
G8AAACATTC 
AAAAT6TACT 
TAACCAAATT 
TTTTGAAGCA 
ATGATA6CAA 
TGTTGAAAAA 
TTA6AAAAAT 
GTATACTCAA 
CCTTTCTTTT 



41 

I 

CAGCAAAGCT 
TGAAGGGCAT 
CCATGTTCAA 
CAGATATTGA 
TTATTACCCT 
GGCTTATAAT 
CTGGAAAAGG 
AATTCTACAG 
TTAT6TGAAG 
GAATGACAAT 
AAGCTACCTT 
TTOTTGTATT 
AACTATTCTG 
TCTTTATGCT 
AAGOCCTTAT 
6CAAATACAC 
TTATGCATGA 
ATGAAAAAAA 
AGCAGCACOG 
ATTAGGATAT 
TATCTTOGAC 
TAAGCAAAT6 
ACATAATCTA 
GACTAGTTTA 
TTAAAAAAAA 



51 

1 

GAAGTA6CAG 
GGCTATAGOC 
AAGAGGACGC 
GAAAGCCTCC 
GAAAGAAAAT 
CAAAAAAGTT 
GCATCTGAAA 
TA6GAAACTG 
GATGAAAGGT 
CAGAATTCCA 
AAGAAAOGCT 
ATACATTCAT 
TTGTGACTAT 
ATATTACTAT 
AAGTCAAAAG 
ACTTCTTTCC 
TTTGGTTTGT 
TTATACGCTA 
GTCTTAATTT 
GTGTGTTTAC 
ACATTTGAAA 
TATACCTAOC 
ATCAATTTCT 
AAGAATCAAA 



Seq ID NO: 491 Protein sequence
Protein Accession #: NP_005400.1

1          11         21         31         41         51
|          |          |          |          |          |
MSVKGKAIAL AVILCATWQ GPFMFRR6RC LCZGPGVKAV KVADIEKASI MYP5NNCDKI
EVIITLKENK GQRCLNPKSX QARLIIKKVE RKNF

Seq ID NO: 492 DNA sequence

Nucleic Acid Accession #: NM_000577.1

Coding sequence: 41..520



I 

1 

GGCACGAGGG 
CCGACCCTCT 
GAAGACCTTC 
CAATTTAGAA 
CCATGGAGGO 
GGAGGCAGTT 
CATCCGCTCA 
CCTCTGCACA 
0GTCAT66TC 
TCCCATTCTT 



11 
I 

GAAQACCTCC 
GGGAGAAAAT 
TATCTGAGGA 
GAAAAGATAG 
AAGATGTGCC 
AACATCACTG 
GACAGTGGCC 
6CGATGGAAG 
ACCAAATTCT 
6GATGGCAAG 



21 
I 

TQTCCTATCA 
CCAGCAAGAT 
ACAACCAACr 
AT6TGGXACC 
TGTCCTQTQT 
ACCTQAGOGA 
CCACCACCAQ 
CTGACCAGCC 
ACTTCCAG6A 
OACTGCAGGO 



31 

1 

G6CCCTCCCC 
GCAAGCCTTC 
AGTTGCCGGA 
CATTGAOCCT 
CAAGTCTGOT 
GAACAGAAAG 
TTTTGAGTCT 
CGTCAGCCTC 
GGAC6AGTAG 
ACTGCCAGTC 



41 

1 

ATGGCTTTAG 
AGAATCTGGG 
TACTTGCAAG 
CAT6CTCTQT 
GAT6A0ACCA 
CAGGACAAGC 
GCCX3CCTGCC 
ACCAATATGC 
TACTGCCCAG 
CCCCTGCCCC 



51 

1 

AGACGATCTQ 
ATGTTAACCA 
GACCAAAT6T 
TCTTGGGAAT 
GACTCCAGCT 
GCTTCGCCTT 
CCGGTTGGTT 
CTGACGAAGG 
6CCTGCCT0T 
A666CTCC0G 




GCTATGGG6G CACTGAGOAC CAGCCATTGA GGQGTQQACC 
CCTGOTCAQl GGACTCTGOC TCCTCTTCAA CTGACCAGCC 
GTCTTTCTAA TGT6TGAATC ASAGCACAGC ASCCCCTGCA 
TCTGCATTCA GGATCAAACC CCGACCACCT GCCCAACCTG 
CTTCCTCCCT CATTCCACCT TCCCATGCCC TGGATCCATC 
ACCAA6TG6C TCCCACACCC T6TTTTACAA AAAAGAAAAO 
TTTAAGGGTT TOTGOAAAAT GAAAATTA6G ATTTCATGAT 
QAAGGAGAGC CCTTCATTT6 GAGATTATCn' TCTTTOGGOG 
ATTCCT6CAT TTGTGAAATG ATGGTGAAAG TAAGTGGTAG 
TTTTTTTGTG ATGTCCCAAC TTGTAAAAAT TAAAAGTTAT 
ATTrrTTTTT TCCTTTTAAA ACACTTCCAT AATCTGGACT 
CCCAGCCTCC AAGCTCCATC TCCACTCCAG ATTTTTTACA 
TCCTATCAGA AGTTTCTCAG CTCCCAAGOC TCTGAGCAAA 
TCTTCCTCTG CTGAAGGAAT AAATTGCTCC TTGACATTGT 
ACTTGTATQA AAGATGGCTG TGCCTCTQCC TGTCTCCCCC 
GAGCA6GAAA CATQACTCGT ATATGTCTCA GGTCCCTGCA 
CTCTTGGCAG GTACTCAGCG AATGAATGCT GTATATGTTG 
CTGTGACTTC AGCTCTGTTT TACAATAAAA TCTTGAAAAT 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 493 Protein sequence
Protein Accession #: NP_000568.1 



PCT/US02/12476



CTCAGAA6GC 
TCCATGCTGC 
CAAAGCCCTT 
CTCTCCTCTT 
AGGCCACTTG 
ACCAGTCCAT 

rrrm ' T ' i ' T ' i ' 

AGAGGCTGAG 
CTTTTCCCTT 
GGTACTATGT 
CCTCTGTCCA 
GCTGCCTGCA 
TGTGGCTCCT 
AGAGCTTCTG 
ACCAGGCTOG 
OGGCCAAGCA 
GGTGCAAAGT 
GGCTAAAAAA 



GTCACAACAA 
CTCCAGAAT6 
CCATGTOGCC 
GCCACTGCCT 
ATGACCCCCA 
GAGGGAGGTT 
CAGTCCCCGT 
QACTTAAAAT 
CTTTTTCTTC 
TAGCCCCATA 
GGCaCTGCTG 
GTACTTTACC 
GGGGGTTCTT 
GCACTTGGA6 
GAGCTCTGCA 
CCTAGCCTCG 
TCCCTACTTC 
AAAAAAAAAA 



51 



GGCTQCGOGA 
G6CTTTTGTT 
CGGGACCTTO 
AOAGGOGOQG 
GCTGGTGGCT 
GCAAGAGCX36 
GCGACGTGCC 
CCTGCTGCAC 
CCGOGCTCOG 
TOQATGACCA 
CCGGCQCCTT 
AGCTGCXSCCT 
GGGCCOOCCT 
ACTACCTGGA 
GAGA6CTGC96 
TG6G0GTGGC 
0GA6AGCTGT 
CCT6CCCTQA 
ACGC06AGTG 
CATOQGGTQfT 
CCCTCCAGGA 
AGGTCAACCC 
GGGA6A6GCC 
GGQACGTCCA 
TGAGCACTQC 
AGGTCATGG6 

ccaagccgga 
tgcgcagcxk: 
gctcgggcag 
octccagctc 
aoaagacctc 

TCCTGGCCCT 
GAGGCCAAGG 
TGGAGAGGCC 
GTCCCAGCCC 
CAGGTCAGCT 
TCCGGCTGCC 
CTACAGAGGA 
CGCCTCCTOC 
CCTGCAGA6A 
TCTQRGATGA 
CTGCGCCCTT 
GOAGTCTGAO 
GGGGCCACTG 
GGAGGCAGCG 

GCcrrcciGG 

CCTGCTCCCA 
CAGG6CTCAG 
CCC0GCACT6 
GCACGGGGAC 
TGTCCTTGTT 
CACCTTGGAC 



11 
I 

GCX3A6GGTTC 
GTCTCCX3CCT 
GCTCTGCCCT 
GOGGGTGGCC 
GCTATGTGCXv 
6A6CTGCGGC 
CCAGG06GAG 
CAGOGAGATG 
GGACAGCAGC 
CTTCCAGCAC 
06GAGA6CTG 
GTACTACCGC 
GCTCGAGCGC 
CTGCCIGGGC 
CCTGOG GGCC 
CAGOGAOGTO 
CATGAAGCT6 
CTATTGCCGA 
QAGQAACCTC 
GGAQAOTGTC 
CAACA6GGAC 
CCAG6GCCCT 
ACXTTTCAGGC 
GGACTTCTGG 
CAGTGATGAC 
TQAOSGCCTG 
CATGACCATC 
CTACAACGGC 
CGGTGATGGC 
CGGGACX;CCC 
GGCTGCCAGC 
TACAGTAGCC 
ACTGACTTTG 
TGGGGTGGGA 
CAGGCCTGGC 
GGGAGCCAGT 
TAGCCCTCCC 
GGCXTCAAAG 
CACTGG6ACT 
A6CCCCGCAC 
TGCATQATGC 
QAGGGGCCCC 
QACTQTOCTC 
ACCCACCTGC 
TGGGCTCTGC 
GGTCCAGGGC 
TCCTCACCCA 
AGTGACGCTC 
CACACGG6AA 
CTGGATAGTT 
CATGGAGAGC 
CCT6GTGACC 



21 
I 

GGACCTOGCA 
CCTCXSGCOGC 
TOGCGGGCGG 
GGGGGCGCOG 
GCOGCAOOGC 
GAGGTCOGCC 
ATCrCGGGTG 
GAGGAGAACC 
OGCGTCCTGC 
CTGCIQAAGQ 
TACAGQCAOA 
GGTGCCAACC 
CTCTTCAAGC 
AAGCAG6C06 
ACCCGTGCCT 
GT0CG6AAAG 
GTCTACTGTG 
AATGTGCTCA 
CTGGACTCCA 
ATCGGCAGOG 
ACGCTCAGG6 
GGGCCTGAGG 
ACGCTGGAGA 
ATCAGCCTCC 
CGCTGCTGGA 
GCCAACCAGA 
CQGCAGCAGA 
AACGAGGTGG 
TGTCTGGATG 
TTGACCCATG 
TGCCCCCAGC 
AGGCCCOGGT 
CCftAAAATAC 
CAGGGA6GGC 
CTCGCCTGCC 
GTGCCCAAAA 
CCCAGCTCCC 
CAAOCCGCIG 
CCCAGCAGAG 
GGGCTGTCTG 
CCTCCCCTCA 
AGOGTCTGCA 
CCACAGAOCC 
GCTTCTGCTG 
CAATGTGGGC 
TGTTGGAGGA 
GA7CAGGAAC 
GGCTGTCACC 
TGCCTAGGTC 
AAGGGCTTTT 
TGTTOGCTCX 
TCCTGTCACT 



31 
I 

OCCO60G0GC 
CX3CCGCCTCT 
GAACTGOGCA 
COGGCC0CX3C 
TGGTOBCCTG 

agatctaggg 
agcacctgcg 
t66ccaaccx3 
agg(xatgct 

ACTOQGAGOG 
ACQGQAGGGC 
TGCACCTGGA 
AGCTGCACCC 
AG6C3GCIGCG 
TOGTGGCTGC 
TGGCTCASGT 
CTCACTQCCT 
AGGGCTGCCT 
TGGTGCTCAT 
TGCACAOGTG 
CC3JIGGTCAT 
AGAAGCX3GCG 
AGCTG6TCTC 
CAGGGACACT 
ACGGGATGGC 
TCAACAACCC 
TCAT6CAGCT 
ACTTCCAGGA 
AOCTCTGOGG 
CCXrrCCCAGG 
CCXXX3ACCTT 
GGOGGTAACT 
AACACAGAOO 
0G60GGCTCT 
TTTCTGCCTT 
GCCATGTATT 
TGCACGGCCO 
GA6CCCACA6 
CGCACCAGCC 
GGTGTCCGCC 
GCGCAGGCTG 
GGGTGAOGCC 
T6CAGTGAGG 
GAGQAGG6GA 
TGCCOCTCGC 
CCCCGAGGGC 
CA6G6CCTCC 
T6CTCACA66 
CCTTOCOQAC 
CXAAACATGC 
TCCCAGATGG 
CACTGAGGCC 



41 

I 

CXX3G06CCGC 
GGACOGCGAG 
GGACCCGQCC 
CATGGAGCrrC 
G6CCC00GGG 
AGCCAAGGGC 
GATCTGTCCC 
CAGCCATGCC 
TGCCACCCAG 
GAOGCTGCAG 
CTTCOGGGAC 
GGAGACGCTG 
CCAGCTGCTG 
GCCCTTOGGG 
TOGCTCCTTT 
GCCCCTGGGC 
GGGAGTCCCC 
TGCCAACCA6 
CACCGACAA6 
GCTGGCGGAG 
CCAGG6CTGC 
CCGGG6CAA0 
TGAAGCCAAG 
GTGCAGTGAO 
CAGAGGCOSG 
06A6GTGGAG 
GAAGATCAIG 
CG0CA6TQAC 
CCGGAAGGTC 
CCTGTCAGAG 
CCTCCTGCCC 
GCCCCAAGGC 
ATA7TTAATT 
GA6CAGGG6C 
TTAATTTTGT 
TCAGQGACCT 
CAGAAGCAGC 
OSAGCCTGIG 
AGCCCT6GCC 
ATCXAGGGTC 
CAGAGCCOGG 
TGAGACAGCA 
G6CXXTCCAT 
A0CTGG6CCC 
ACACA6GGCT 
TGAGGAGCAG 
CTGTTCAGGG 
GAT6CT0GTG 
CCAGCCAGCT 
ATCCATTTAC 
CTTCGGAGGC 
ATCAGQGCCC 



51 
I 

G6CCGC0G0C 
CCGGGCGCGC 
AQGA'TCCQAG 
GGGGCGCX3AQ 
GACGOGQCCA 
TTCA6CCTGA 
CAGGGCTACA 
GAGCTGGAGA 
CTGCGCAGCT 
GOCACCTTCC 
CTGTACTCAG 
GCCGAGTTCT 
CTGCCTQATQ 
GAGGCCCOGA 
GTGCAGGGCC 
CCGGAGTGCT 
GGCGCCAGGC 
GCCGACCTGG 
TTCTGGGGTA 
GCCATCAACG 
GGGAACCCCA 
CTGGCCCGGC 
GCCCA6CTCC 
AAGATGGCCG 
TACCTCCCCG 
GTGGACATCA 
ACCAACCG6C 
QACGOCAGCG 
AGCAG6AA6A 
CAGGAA6GAC 
CTCCTCCTCT 
CCCA6GGACA 
CACCTCAGCC 
AG606CAGAG 
ATGAGGTCCT 
CAGGGGCACC 
CCCTC6AGGC 
CCTTCCTTCC 
CACCCCCCAG 
TGGCAGAGCC 
CCCCACCTCC 
CCACTGCTGA 
GGGCAGATGA 
AAA60CCCAG 
CACAGGGCA6 
CCAGGACCOG 
TGACACAGGT 
GCTGGTGAGA 
GCACT6CAGG 
TGACACTTCC 
CCX3CAGGGCC 
TOCCCCAGGC 






1          11         21         31         41         51
|          |          |          |          |          |

KAIiETICRPS GRKSSXKOAF RZWDVNQKTP YIiRNNQLVAG YLOOFlIVIIIiE EKIPVVPIKP 
EAIiFLGIHGG KMCLSCVKSG OBTRLQXiBAV NXTDIiSBnUC QDKRFAFZRS DS6PTTSFBS 
AACPGWPLCT AMEADQFVSL TOMPDBGVMV TKPYFQEDE 

Seq ID NO: 494 DNA sequence

Nucleic Acid Accession #: NM_002081.1

Coding sequence: 222..1898




CTGGACGGGC CCTCCTTCCC TCCTGTQCCC CAGCTGCCAG 
TGTGGTGTTG GGAAQGGGTC CTQO^GGGGG AGGAGGACTT 
TCCTGAACOS ACTOACCCTG AGQAGGCCGC TTAGTGCTGC 
CGCACAGTGG ACGGAGGTCC CCGGTTGCTG GTCAGGTCCC 
CT6ACTTTA6 ATGTTTT6GG ATCAGGAGCC CCCAACACAG 
CX:CTGCCAGT QCCAGGGTGG GCTGGGGACT CTGGCACAGT 
CAGCACTCCC 6CTGCACACA GACGGCCTAG GGGTGGGGCT 
TCTCTGGAAG GGGCAGCCCT GAGTGGTCAC TGGTCAGGGC 
CCTTCCTCCA CAAG6TCCCC CCACCGCTCA GTGTCAGCGG 
TCCITGTATO KKSMANGGC TGQAAACCTA AA 

Seq ID NO: 495 Protein sequence
Protein Accession #: NP_002072.1



1 
I 

MELRARGWWL 
ICPQGYTCCT 
TLQATFPGAF 
QIiLLPDDyLD 
PLGPBCSRAV 
TDKFWGTSGV 
RGKLAPRERP 
RGRYLPEVMG 
ASDDGS6SGS 



I 

LCAAAALVAC 
SEMEBNLANR 
6ELYTQNASA 
CLQKQABAIiR 
MKLVYCAHCL 
BSVIGSVHW 
PSGTLEKLVS 
DGLANQINNP 



21 

t 

ARGDPASKSR 
SHAELETALR 
FRDbYSELRL 
PFQBAPREUl 
GVPGARPCPD 
LAEAINAliQD 
EAKAQLRDVQ 
EVEVDITKPD 



31 
I 

80GEVRQIYG 
DSSRVLQAML 
yVRGAmJOiE 
LRATRAFVAA 
YCRNVLKGO. 
NRDTLTAKVI 
DFWISLPQTIj 



GTGGCCCTGG 
GGAGGGTCTG 
TTTGCTTTTC 
CATGGCTT6T 
GCAAGTCCAC 
GATGCC66GC 
CAGACCCCAC 
AGTG6CCAAG 
6T6A0GTGT6 



41 

I 

AKGFSLSDVP 
ATQIiRSFDDH 
ETIiAEFWARL 
RSFVQGLOVA 
ANQADLDAEW 
QGOGNPKVNP 
CSEKMALSTA 
KIMTNRLRSA 



Seq ID NO: 496 DNA sequence

Nucleic Acid Accession #: NM_001650.2

Coding sequence: 40..1011



1 

1 

G60GCAGGCA 
-AGGCGGTGGG 
GGGGTCTGGA 
TTTQTTCTCC 
GTCGACATG6 
TTTGGCCATA 
AGGAAGATCA 
ATTG6AGCAG 
ACCATGGTTC 
TTTCAATTGG 
TCAATAGCTT 
ACTGGTGCCA 
GAAAACCATT 
TATGAGTAT6 
AAAGCTGCCC 
CaOAGGGATG 
GAG6AGAAGA 
C6CACTGAAA 
6AAACAGATT 
GTCTAAACAA 
TGCAAATCTA. 
TCTAGTTACC 
OCTGACAGAA 
AOTCAATTCT 



11 
1 

ATGAGAGCTG 
GTAAGTGTGG 
CTCAAGCTTT 
TCAGCCTGG6 
TTCTCATCTC 

tcagcggtgg 
gcatcgccaa 
gaatcx:tcta 
atgqaaatct 

TGTTTACTAT 
TA6CAATTGG 
GCATGAATCC 
GGATATATTG 
TCTTCTGTCC 
AGCAAACAAA 
ACCTGATTCr 
AGGGGAAAGA 
GCAGACAAGA 
TGTTATAAAT 
TAAATATTTC 
AAAAAAGAAA 
TTTCATTAAC 
CTCAAAGACA 
TAT7TGAATA 



21 

1 

CACTCTGGCT 
ACCTTTGTGT 
CTGGAAAGCA 
ATCCACCATC 
CCTTTGCTTT 
CCACATCAAC 
GTCTGTCTTC 
TCTGGTCACA 
TACCGCTGGT 
CTTTGCCAGC 
ATTTTCTGTT 
CGCCCGATCC 
GGTTGGGCCC 
AGATGTXGAA 
A6GAAGCTAC 
AAAACCTQQA 
CCRATCTGGA 
CTCCTTAGAA 
TAGAAATGTG 
ATAATTTACA 
TATTTTTAAG 
AACCAATTTT 
CGTCTATCAa 
TTTATTCTAT 



31 

I 

GGGGAAGGCA 
ACCA6A0AGA 
6TCACAG0GG 
AACTGG6GTG 
GQACTCAGCA 
CCTQCAGTGA 
TACATCGCAG 
CCTCCCW3TG 
CATGGTCTCC 
TGTGATTCCA 
GCAATTGGAC 
TTTGGACCTG 
ATCATAGGAG 
TTCAAACQTC 
ATGGA6GTGG 
GTGGTOCATG 
GAGGTATTOT 
CTGTCCTCAG 
CAGGTTTQTT 
AAGGftGGAAC 
ATGTTCTTAA 
AAGCXsTGTGT 
CTTATTCCTT 
TAAACTGAGT 



41 

1 

TGAGTGACAG 
ACATCATGGT 
AATTTCTGGC 
GAAGAGAAAA 
TTGCAACCAT 
CTGTGGCCAT 
CCCAGTGCCT 
TGGTG6GAGG 
TGGTTGAGTT 
AAOGGACTGA 
ATTTATTTGC 
CAGTTATCAT 

crcrrccTC!GC 

GTTTTAAAGA 
AGGACAACA6 
TQATTQACXST 
CTTCAGTATG 
ATTTCCTTCC 
6TTTCATGTC 
GGAAGAAACC 
GCAAATATAT 
CAAGATTTGG 
CTCTAGTOQA 
TTAACAATGG 



GGAGGGGTG6 
GGGGCAGCTG 
ATCACCGTCC 
tCTCTGGAAC 
CGCATAATAA 
GCC3U3GACAG 
CCTACGCTCA 
CCT6CT6TGT 
TTCTTTTGAG 



51 
I 

QABISGEHIiR 
FQHLIiNfDSER 
IiERLFKQIiHP 
SDWRKVAQV 
RNLLDSMVIiI 
QGPGPSEKRR 
SDDRCWNGMA 
YNGNDVDFQD 



51 
I 

ACCCACAGCA 
GGCTTTCAAA 
CATGCTTATT 
GCCTTTACCG 
GGTGCAGTGC 
GGTGTGCACC 
GGGGGCCATC 
CCTGGGAGTC 
GATAATCACA 
TOTCACTGGC 
AAICAATTAT 
GGGAAATTGG 
TGGTGGCCTT 
AGCCTTCAGC 
GAGTCAG6TA 
TGACOGGGGA 
ACTA6AA6AT 
ACOCATTAAG 
ATATTACTCA 
TATTGTGAAT 
ACCTATTTTA 
TTAAGTCCTG 
ATATTQGTAT 
C 



Seq ID NO: 497 Protein sequence
Protein Accession #: NP_001641.1

1          11         21         31         41         51
|          |          |          |          |          |

MSDRPTARRW GKG6PLCTRE NIHVAFKGVW TQAFMKAVTA EFLAHLXPVL LSUSSTINWG 
GTEKPLPVDM VLISLCFGLS lArmVQCFGB ISGGHINPAV TVAMVCTRKI SIAKSVFYIA 
AQCLOAIIOA GILYLVTPPS VVGGU3VTMV HGNLTAGHGL IjVELIITFQL VFTIPASCDS 
KRTDVT6SXA LAIGFSVAIG HLFAINYTGA SMNFARSFGP AVIKGNWENH WIYWVGPIIG 
AVIAGGLYBY VFCPDVEFKR RPKBAFSKAA QQTKGSYMEV BDNRSQVETD WjILKPGWH 
VIDVDRGEEK KGKDQSGEVL SSV 

Seq ID NO: 498 DNA sequence

Nucleic Acid Accession #: AB020684.1

Coding sequence: 1..1744



CCCCCTTGTC 
TTG6TAC0G6 
GAOGGTTACC 
TGCTTGCTTT 
CATATATGGC 
CTTTTTCAAT 
CTCATATCCA 
ACTTTATAGA 
GCAOTTTGCT 
C6GGTACATT 
ACTTTGTTTT 
TTT6QTAATT 



11 
I 

A7TAATACAT 
ATTTATACCA 
AGAGGAGAAG 
TATGTTGCTG 
ACATATTTAA 
CAT6GAGAGT 
TTTCTTGTTC 
GGAAGCTTGA 
CAGTTTGTAC 
GATATAT6TA 
GTTTTGATGT 
ATTTGGGGTA 



21 
I 

TAAAAAGATT 
AAATAATGGA 
GACTCAGTCC 
TAATTTTTAT 
GTGGCAGCCS 
GTACCCGTGT 
TTCA6ATGTT 
TTGCRCTCT6 
TTCTTACTCA 
AATTACGGAA 
TTGGGAACTC 
TTCTGGCAAT 



31 
I 

CAATCTTTAC 
CTTGATTGGT 
TATTGAAAGC 
TTTAAATGGA 
ATTAGGAGGC 
AATGTGQACA 
GCTAGTGACT 
CATTTCCAAT 
GATTGCATCA 
GATCATTTAT 
AATGTXATTA 
GAAACCACA7 



41 

I 

CCTGftGGTAA 
ATTCAAACCA 
TGTGAAGGAT 
CTAATGATGG 
CTGGTTACAG 
CCACCTCTCC 
CATATTCTCA 
GTATTTTTCA 
TTATTTGCAG 
ATACACATGA 
ACTTCTTATT 
TTCCTGAAAA 






51 
I 

TTTTGGCCAG 
A6ATATGTTG 
TGGGAGATCC 
CATTATTCTT 
TGTTGTGCTT 
GTGAAAGCTT 
GG6CTACAAA 
TGCTTCCTTG 
TATATGTTGT 
TTTCTCTTGC 
ATGCTTCTTC 
TAAATGTATC 




TGAACTTAGT TTATGGQTTA TTCAAGGATG TTTTT6GTTA TTTGGAACTG TCATACTTAA 780 

ATACTIGAGA TCTAAAATTT TTOOTATTGC AGAT6AG6CT CATATTGGCA ACTTACTAAC 840

ATCAAAATTC TTTAGTTATA AGGATTTTGA TACTTTATTG TATACCTGTG CAGCQGAGTT 900

TGACTTTATG GAAAAAOAGA CTCCACTGAG ATACACWAAG ACATTATTGC TTCCAGTTGT 960

TCTTGTAGTC TTTGTTGCTA TTGTTAGAAA GATTATTAGT GATATGTGGG GTGTCTTAGC 1020 

TAAACAACAG ACACATGTAA GAAAACACCA GTTTGATCAT GGAGAGCTGG rPTACCATGC 1080 

ATTGCAATTO TTA6CATATA CAGCCCTTGG TATTTTAATT ATGAGACTAA AACTCTTCTT 1140 

GACACCACAC ATOTOTGTTA T6GCATCACT GATCTGCTCA AGACAGCTAT TTGOATGGCT 1200 

CTTTTGCAAA GTACATCCTG GTGCTATTGT GTTTGCTATA TTAGCAGCAA TGTCAATACA 1260 

AGGTTCAGCA AATCTGCAAA CCCAGTGQAA TATTGTAGGG GAGTTCAGCA ATTTQCCOCA 1320 

AGAA6AACTT ATAGAATGGA TCAAATATAG TACTAAACCA GATGCAGTGT TT606GGTGC 1380 

CATGKXCA08 ATOGCAAGTG TTAAGCTCTC TGCACTTCGG CCCATTGTGA ATCATOCACA 1440 

TTATGAAOAC GCAflGCTTGA GAGCCAGAAC AAAAATAGTA TACTCAATGT ATAGTC3GGAA 1500 

AGCAGCCGAA GAAOTGAAGC GAGAACTGAT AAAGTTAAAA GTGAACTATT ACAT TCTAGA 1560 

AGAGTCATGG TGTGTAAQAA GATCCAAGCC TGGTTGCAGT ATGCCTGAAA TTTQQGATGT 1620 

AGAAGATCCT GCC3UITGCTG 6GAAAACTCC CTTATGTAAC CTCTTOGTOA AGQATTCCAA 1680 

ACCTCACTTC ACCACTGTAT TCCAGAACAG T6TTTACAAA 6TCCTAQAAG TTGTAAAAQA 1740 

ATGACTGCTA CATGACCTGC TGCCTACX3GA GAACTAC3^TC TGTAATGGTT TTAATGTTTT 1800 

GCTAAGTCAT GTGTTGTTCA TATCCCAAAA ACTTTTATAG GTAACT6TTT TCAA ATAGA A 1860 

AACGTTTTAT TTGGTCAATT TQAATGTCAT TCTAATTATA AAAATQACTT ACACCTTTAT 1920 

CAATTGGTTA CTATTTCAAT GCACCCTTTA AAATTT6CTA TGCAAAT6A6 TATATGCTTO 1980 

TACTTGACTT TAATATTTGT GCTAAAGTQA GCAAAGCTAC CTGTATAAAG AAAACACAGT 2040 

GGGTTGTQAC AAGGATGACA TGAAAATACA GGACAATTCT GACAATGTAG GGGCTG ATTT 2100 

TATAGTGTAA GAACTATTAA TGCCCCTTGC TTCTTTTTTC TGCCTCTTGC TC TTGT CTTT 2160 

TGGACATTTC AGTGATTGTA AaTTCTTGGG TCATGTCAGC COCTGT CATC AACTTG AOTT 2220 

ACAGTAGATG GGGCAGACAT GGAGTGTTTG CTATATAAAA CTATCTGTTT GTTTTACTTC 2280 

CTTGTGCGCT TTTTGTTCTC TGTTCTCTTG TTAATGAAGC TTTTCCTGCC CATTATTAAT 2340 

CCAAACTCTT GGACCTTGTG GTTAGGAAAT TCCCTTAACT TCCAGCCATA TGGCATTATC 2400 

GTGTCTCTTT CTCTCTCTCT CTTGCTCTCT CTCTTCTCCT CTTCCCCATA TTTTCTGTCA 2460 

AATAAGTACT OTTTACTCAT TTAGTTOCTT ATCAAGTACT TATTCTTGCTr T TTAAA AAAA 2520 

ATTAATGGTA ACTCTATTTT TCTCATTTTT AGCATTATTC AAATOTTTAT ATTTTAATAC 2580 

CTTTAAACCA CTTTAAAGTT TTTTCATGTT TAATTATAGT TTTAAGAAAA ACTATTTTGA 2640 

ACAACXCCAA ATATAGTGCA TCTAGAAACT AATGTATATT TGATTAGACA TCATTTATAG 2700 

TGGAACAGTA QACTGTAOTA CaVTGGTAATT TTTCTTTTAC TATTAAGATA CAAT AAAAC A 2760 

TGACTAATTT TGCTGTCAAA AATSTAAAOA ATAATOATAA ArOGAGTTTT TT aTATTTTA 2820 

CTTTTAAGAT TGCCTGTCTT TAATAAGAC3V AAGCCTTAAG CCTTATGTTA TAATTTTGGT 2880 

TCTAAAAACC ATCATTTCAG TATAAGGAAT AAGTATATTT CX3TCCTCCTC TTTAGTTTTT 2940 

TTCTTCCTAT TTATTTTTAT TTTGAAAAAT TTCTACACCT TCTTTGAATT CCTTGTATQA 3000 

ATTTTTGTTT CTTAGAAGTT AATTTGTGTG AAATGA6ATT CTTCAAAACG ATGAA ACCT C 3060 

ATAOCTCTGA GAAAA6GTTT TAGGGTmA AATTCIAAGC AAA006T0AC TATGQCTOAC 3120 

AGACTACACA TTTAATTATA CAGCTTCTCT TTCTTAACCA CAfiOCAGATT AACCTCATT6 3180 

TGGATTGTCX: TTCAGACCTT AQTCCTCAGG CATGGTTTGT GGTGCCCACT CCTGGAAGCC 3240 

GCTGTTCCCT TTCTACCTTC TTACCAGAGC CCAAGGGCAG GCCTGGTCCC GGGGAAGCAG 3300 

CAGCTTGCTG ACATAAGTCA GCTGCAAA6G CTGA6GA6TG TGCCCTCAGA GAA6CACC6C 3360 

GCCOCAGTCT TGTacCAOOS CCTAGA6CGB CAGCTCGCAG GGAT6CTCCT TOCCTGGAGG 3420 

CAGCCCAGGA GAGQ6ACTCT GGCAGCX3TTC TTCA6ATTTG TGGCCACTGT TTCTCATTTG 3480

CTGGTTGACT GnTrrATTT CTTAGGCTTT TGCTAGTTTT AGAAAATAGG GAAGCAGCCC 3540 

TTGATTTGTG GATTAAAAGC AACATTTGAG CX3ATGATGCA CAACAGTCCA GGAAAATGGG 3600 

06G7GGACAC TTGAGGCTGA GGATGGQAGT T6ACAT6AGC AGGGAGAGGG AGGTGCGCGC 3660 

TGCTTATCT6 TGATTGTT6C TCACCTGAGT GTGGCTGATT GTGTACATCC AGCAQTTACA 3720 

ATTTTTAAAA ATTATACTTT TACATTTATT TTATATTTTT CTCACCCCCA GTAATTTCCT 3780 

TCCAAAGAAG TTCACATGTA ATAAGTAQAA ATTCTGTATA GGAAAAAAGC ATTAAAAATA 3840 

CTATTATAAC TGCTTCATTT GCTGGGAACC ATTAAAAGTA ATATAAATTA GCTTTTTCCA 3900 

GAAG6ATCCT TTT6TAGCAG TCTTTATGAA TGTAACCCCC AGCftAAATAT GGCTATATAT 3960 

TAGGGGAGCC ASTTTGOAGC AQAGGCCTGA AGGTCCCTGC TAT6CAGCCX3 TGGCCACAGC 4020 

TC3GCR0CCCA A6CACTGTGG AGC3VT0CACA CCTTTGATGG C3UITGCAGAT TGGTAGCAGG 4080 

TTCCATAGGC GTACAAAACA 6TATTAAA6C TCAGTGTTTT GCATATTGTT AGCATTTACA 4140 

AATATTTTTG CTTTAGTATG AGGAAAGTAA GGATGGGCAA AGAAGCGATC AAAATA6CTA 4200 

TTGCTACStfW: ATTTTCGAAA ACAAAGTTGG GGCTGTATTT CTTTAAAAAG ATAAGCCTCT 4260 

AAAAATGCTT GGCAAAAAAA ATATAGTGTT AAAATAGGCC AGTGAT ATTA ATQAGAAAAT 4320 

GAAAGTATGT ATCAGGAATA AAGTGATATT GCATAGGAGT ATT6TATTTT TATGAATTTT 4380 
ATGCCAGTTG TTTACAT6TA CTATATATGT TAAATTAAAA AAAATCATGA GAAATG 

Seq ID NO: 499 Protein sequence
Protein Accession #: BAA74900.1

1          11         21         31         41         51

|          |          |          |          |          |

PLVINTLKRF NLYPEVILAS WYRIYTKIMD LIGIQTKICW TVTRGBGLSP lESCEGI^P 60

ACFYVAVIFI LNGLMMALFF lYGTYLSGSR IjGGLVTVLCP PPNHGECTRV MWTPPLRESP 120 

SYPFLVLQML LVTHILRATK LYRGSblALC ISNVFFMLPW QPAQPVLLTQ lASIiPAVYW 180 

GYIDICKIiRK IIYIHMISLA LCFVIjMFGNS MLLTSYYASS LVIIWGILAM KPHFUaNVS 240 

ELSIjWVIQGC PHLFOTVILK YLTSKIPGZA DDAHIGNLLT SKFPSYXDFD TLLYTCAAEP 300 

DPMBKETPLR YTRTLLLPW IiWPVAlVRK IISDMWGVLA KQQTHVRKHQ PCHGELVYHA 360 

LQIiLAYTALG ILIMRIiKLPL TPHMCVMASL ICSRQLFGWL FCKVHPGAIV PAILAAMSIQ 420 

GSANLQTQWN IVGEPSNLPQ EELIEWIKYS TKPDAVFAGA MPTMASVKLS ALRPIVNHPH 480 

YEDAGLRART KIVYSMYSRK AAEEVKRELI KLKVNYYIIiB ESWCVRRSKP GCSMPEIWDV 540 
EDPANAOKTP LCNLLVKDSK PHFTTVFQN8 VYKVLEWKB 

Seq ID NO: 500 DNA sequence

Nucleic Acid Accession #: NM_001276.1

Coding sequence: 127..1278

1          11         21         31         41         51

|          |          |          |          |          |

AGTGGAGTGG GACAGGTATA TAAAGGAAGT AC3U3GGCCTG GGGAAGAGGC CCTGTCTAGG 60 

TAGCTGGCAC CAGGAGCCGT GGGCAAQGQA AGAGGCCACA CCCTGCCCTG CTCTGCTGCA 120 






6CCAGAATGO 
TGCTCTSCAT 
GGGAGCTGCT 
GCCAATATAA 
ATGCTCAACA 
T6GAACTTT6 
TTCA'TCAAGT 
TGGCTCTACC 
GCCGAATTTA 
TCTGOGGGGA 
GATTTCATTA 
GACAOTCCCC 
TATGCF6TG0 
CCCACCTTCG 
TCAGGACCGG 
ATCTGTGACT 
GCCAGCAA66 
CAGTACCTGA 
TTCCAGGGCT 
GCACTCGCTG 
CCCCCTCTGG 
GCCTCAGTCT 
G0CCT6GTG0 
GACTCGGGAT 
TGGCAAGGGA 
TQGCAAGCTC 
TACCCCCTGC 
ACTTCCCCTT 
CGCTTTGCTT 
TCTTCTGGC3T 
ATGTT 



GTGT6AAS6C 
ACAAACT6GT 
TCCX3VGATGC 
GCAACGATCA 
CACTCAAGAA 
GGTCTCAAAG 
CAGTACCGCC 
CTGGACGGAG 
TAAAGGAAGC 
AGGTCACCAT 
GCATCATGAC 
TGTTCCOACG 
GQTACATGTT 
GGAOGAGCTT 
GAATTCCAGG 
TCCTCCGCOG 
GCAAGCAGT6 
A66ATAG6CA 
CCTTCTGCGG 
CAACGTAGCC 
CTCCAGCTGG 
CCCTCCCTTG 
GCA6AGAGGT 
TAGTACACAC 
ATTTCTTCAA 
TATCACCAAG 
AAAGCCAGCT 
CCTAATTCCA 
TGGTCTATCT 
TCCTTCCTCT 



GTCTCAAACA 
CTGCTACTAC 
CCTTGACCGC 

CATCGACACC 
CAGGAACCCC 
ATTTTCCAAG 
ATTCCTGCX3C 
AGACAAACAG 
CCAGCCAGGG 
TGACA6CAGC 
CTA06ATTTT 
TCAGQAGGAT 
GA6GCT6GGQ 
CACTCTGGCT 
CCGGTTCACC 
AGCCACAGTC 
GGTA6GATAC 
6CTGGCAG6C 
CCAGGATCTG 
CTCTGTTCTG 
CCGGGAGCCT 
GGGCCTATGC 
AGGGATGGGG 
TTGTTGATGA 
CTCCCTGCCC 
GAGCCAAACA 
TGAAACCTTC 
CAGCTGCTCA 
TTGAGCGCCC 
OAGCCTTGGG 



GGCTTTGTGG 
ACCA8CTGGT 
TTCCTCTQTA 
TGGGAGTGGA 
AACCTGAAGA 
ATAGCCTCCA 
ACCCAT6GCT 
CATTTTACCA 
AAAAA6CA6C 
TATGACATTG 
CATGGAGCCT 
GCAASTCCTG 
GCTCCTGCCA 
TCTTCTGAGA 
AAGGAGGCAG 
CATAGAACGC 
GACX3AGCA6G 
6CCATGGTAT 



CACACAGCAC 
GATCACCTGC 
AQAGGTCCAC 
CTGTGGGGAT 
TTAATGGAAA 
CCTAGCCCTC 
TCCTACAAGA 
ACTTAGGAAC 
ATAAAGTACA 
ACTAGACCCA 
ACGXICTGAGC 



Seq ID NO: 501 Protein sequence
Protein Accession #: NP_001267.1



1 

I 

mVKASQTGF 
ISNDHIDTHE 
XSVPPFUtTH 
GRVTIDSSYD 
VGYMLRLGAP 
DPLRGATVHR 
GSFOGQDLRF 



11 

I 

WLVLLQCC5 
HNDVTLYGML 
GFD6LDLAWL 
XAKISQHLDF 
ASKLVKGIPT 
TLGQQVPYAT 
PLTNAXKDAIj 



21 
I 

AYKLVCTfYTS 
NTLKSmNPUL 



XSZMTYDFKG 
FGR8FTLA88 
K(^QWVGYDO 
AAT 



31 

1 

VSQYRBGDGS 
KTLLSVGGWN 
TTLIKEMKAB 
AHRGTTGHHS 
ETGVGAPISG 
QESVKSKVQY 



Seq ID NO: 502 DNA sequence
Nucleic Acid Accession #: NM_006474.1
Coding sequence: 181..669



1 
I 

GCTGCCTAGG 
TCCGGCCCCC 
TTCCCCCA6C 
ATGTGGAAGG 
GAAGGA6CCA 
GTTGCCATGC 
AAGTCTGGCT 
GAGGATCTGC 
6CCTCAAAG6 
GTTGA6AAAG 
GCCATCGGTT 
TCGCCCTAAA 
TTCT6ACTCT 
0G6GCCCAIT 
TCACCAGATT 



11 

I 

6TCTGGAAAG 
CCACCGTCGC 
TCAGAATCTT 
TGTCAGCTCT 
GCACAOGCCA 
CAG0TGCCX3A 
TGACAACTCT 
CAACTTCAGA 
TGGCCACCAG 
ATGGTTTQTC 
TCATTGGTGG 
GAGCTGAAGG 
GTGGCCCTCT 
CAGATTCCAC 
TG6TTGTTAA 



21 
I 

CTOGGGCACC 
6CTCCTCCAG 
GCTGCTCX3GC 
6CTCTTCGTT 
GCCAGAAGAT 
AGAT6ATGTG 
GGTGGCAACA 
AAGCACA6TC 
TCACTCCACG 
AACAGTGACC 
AATCATCGTT 
GTTACGCCCT 
CCCTG AGCT C 
GGTQACTTTC 
ACTTT 



31 

I 

CTCCCTCTCC 
GCTGGGCCTG 
CCCCAGQAGA 
TTQGQAAGOQ 
GACACTOAGA 
GTGACTCCAQ 
AGTGTCAACA 
CACGCGCAAO 
GAGAAAGTGG 
CTGGTT6GAA 
GTGGTTATGC 
GCTTGCCAAC 
GTG6GGAGAA 
CGTTTGCCAA 



TCCTGGTGCT 
CCCAGTACCG 

ccxacaTCAT 

ATGATGTGAC 
CTCTCTT6TC 
ACACCCAGAQ 
TXGATGGGCT 
CCCTAATCAA 
TCCTGCTCAQ 
CCAAGATATC 
GGOGTGGGAC 
ACAGATTCAG 
GTAAGCTGGT 
CTGGTGTTGG 
GGACCCTTGC 
TCQGCCAGCA 
AAAG06TCAA 
GGGOCCTGGA 
TCACXaATGC 
GGGGGCCAAG 
CCTGCTGAGT 
AACACACAGA 
AGTGAGGCAT 
TGTTTACAGA 
CTTATCAAAG 
CACAGTGACC 
GTAATCGTGT 
AGAGTTTAAC 
CTGQACTCAC 
TTQCASA6AT 



41 

I 

CFPDALDRFL 
FGSQRFSKIA 
FZKBAQF6KK 
PXtFRGQEDAS 
P6IP6RFTKE 
LKDRQLAGAM 



41 

{ 

GGGGCTCCTG 
TGGCCGCX3QT 
GCAACAACTC 
GGTGGCTCTQ 
CTAC!A6GTTT 
GAACCAGCGA 
GTGTAACAGG 
AACAAAGTCC 
ATGGAGACAC 
TCATA6TTGG 
GAAAAATGTC 
GTGCTTTAAA 
GATGACCCTG 
ATTAACGQAO 



6CTCCA6T6C 
GGAAGGOGAT 
CTACAGCTTT 
GCTCTACGGC 
TGTCGGAGOA 
TC6CCGGACT 
GOACCTTGCC 
GGAAAT0AA6 
C6CAGCACTG 
CCAACACCTG 
CACAGGCCAT 
CAACACTGAC 
GATGGGCATC 
AGCCCCAATC 
CTACTATGAG 
GGTCCCCTAT 
AA6CAAGGT6 
CCT6GATGAC 
CATCAAG6AT 
GATGCCCCGT 
CCX^GGCTGA 
TTTGAGCTCA 
CGCAATGTAA 
TCCCCAAGCC 
GACACCATTT 
ATACTAATTA 
CCCCTATCCT 
AGTGTGTTGG 
CTCCCCCATC 
GAAGGCCOCC 



51 

I 

CTHIIYSFAN 
8NTQSRRTFI 
QUiLSAALSA 
PDRFSNTDYA 
AGTLAYYEIC 
VHALDLDDFQ 



51 
I 

cTCCCS^cxrcc 

GCTTTTAATT 
AACGGGAA06 
GGTCCTG6CA 
G6AAGGGG6C 
AGACOGCTAT 
CATTCGCATC 
AAGCGCCACA 
ACAGACAACA 
GGTCTTACTA 
GGGAAGGTAC 
AAAAGACGGT 
GGAACATTTG 
QAAAQACCTT 



Seq ID NO: 503 Protein sequence
Protein Accession #: NP_006465.1

1          11         21         31         41         51
|          |          |          |          |          |

MWKVSALLEV LQSASLWVLA E6ASTGQPED DTETTGLEGG V3U4PGAEDDV VTFGTSEDRY 
KS6LTTLVAT SVNSVTGIRI EDLPTSESTV HAQEQSPSAT ASNVATSBST EKVD6DTQTT 
VEKDGLSTVT LVGIIVGVLL AIGFIGGIIV WMRKMSGRY SP 

Seq ID NO: 504 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 62..895

1          11         21         31         41         51

|          |          |          |          |          |

CACTGCTCTG AOAATTTGTG AGCAGCCCCT AACA6GCTGT TACTTCACTA CAACT6AGGA 
TATGATCATC TTAATTTACT TATTTCTCTT GCTATGGGAA 6ACACTCAAG GATGQGGATT 
CAAGGAT6GA ATTTTTCATA ACTCCATATG GCTT6AACGA GCAGCCOGTG TGTACCACAG 
AGAAGCACGG TCTGGCAAAT ACAAGCTCAC CTACGCAGAA GCTAAGGCQG TGTGTGAATr 
TGAAGGCGGC CATCTC6CAA CTTACAAGCA GCTAGAG6CA GCCAGAAAAA TT6GATTTCA 



TGTCTGTGCT GCTGGATGQA TGGCTAAGGG CAGAGTTG6A
GCCXZAACTGT GQATTTQQAA AAACTQGCftT TATT^TTAt 
T6AAA6ATGG GATGOCTATT OCTACAACCC ACAC6CAAAG 
AGATCX»AAG CAAATTTTTA AATCTCCAGG CTTCCCMAT 
CTGCTACTGG CACATTAGAC TCAAGTATGG TCAGCGTATT 
T6ACCTTQAA OATGACCCAG GTTGCTTGGC TGATTATGTT 
TGATGTCCAT GGCITTGTG6 GAAGATACTG TGGAGATGAG 
TACAGGAAAT GTCATGACCT TGAAGTTTCT AAGTGATGCT 
CCAAATCAAA TATGTTGCAA TGGATCCTGT ATCCAAATCC 
TACTACTTCT ACTGGAAATA AAAACTTTTT AGCTGGAAGA 
AAAAAAAGGA TGATCAAAAC ACACAGTGTT TATGTTGOAA 
CTCACTGTTA TTATTAACAT TTATTTATTA TTTTTCTAAA 
TAG66AAAAT TGQAAAATAT AGGAAACTTT AAACGAGAAA 
ACTGCATAGA AATAACAAGC GTTAACATTT TCATATTTTT 
TTTGTGOTAT ATQTATATAT GTACCTATAT GTATTTQCAT 
TCTATGTACA GTTTTGTATT ATACTTTTTA AATCTTGAAC 
TCATTGATTA TTCTACAAAA ACATGATTTT AAACAGCTGT 
TGTTT TA TGC ATTATTTAAG CCTGTCTCTA TTGTTGGAAT 
ATTGTTGCAA TAAATATCCT T6AACACACA AAAAAAAAAA 

Seq ID NO: 505 Protein sequence 
Protein Accession #: Eos sequence 






TACCCCATTG 
GGAATC06TC 
QAGT&IXSGIQ 
6A0TAGGAA0 
CACCTGAGTT 
GAAATATATG 
CTTCCAGATG 
TCAGTGACAG 
A6TCAAGGAA 
TTTAGCCACT 
TCTTTTGGAA 
TQTGAAAQCA 
ATGAAACCTC 
TTCTTTCAGT 
TTGAAATTTT 
TTTATAAACA 
AAAATATTCT 
TTCAGGTCAT 
AA 



TQAAGCCAGG 
TCAATAGGAO 
GG8TCTTTAC 
ATAACCAAAT 
TTTTAGATTT 
ACAGTTACGA 
ACATCATCA6 
CTGGAGGTTT 
AAAATACAAG 
TATAAAAAAA 
CTCCTTTGAT 
ATACATAATT 
TCATAATCCC 
CATTTTTCTA 
GGAATCCTGC 
TT7TCTGAAA 
ATGATATQAA 
TTTCATAAAT 



1          11         21         31         41         51
|          |          |          |          |          |
MIILIYLFLL LWEDTQGWGF KDGIFBKSIW LBRAAGVYHR EARSGKYXLT YAEAKAVCBF
EGGHLATYKQ LEAARKIGPH VCAAGWMAKG RVGYPIVKPG PNCX3PGKTGI IDYGIRLKRS
ERWDAYCYNP HAKECGGVPT DPKQIPKSPG FPNBYEDNQI CYWHIRLKTC QRIHLSFLDP
DLBODPGCLA OYVBIYDSYD DVKGFVGRYC GDEEiPODIIS TGNVMTLKFL SOASVTAOGF
QIKyVAHDPV SXSSQGKNTS TTSTGNXIIFL USRPSBh



Seq ID NO: 506 DNA sequence

Nucleic Acid Accession #: NM_007115.

Coding sequence: 69..902



GAATTOSCAC 
CTGACmTAT 
6GGGATTCAA 
ACCACAGA6A 
GTGAATTTGA 
GATTTCATGT 
AGCCAGGGCC 
AXAG6A6TQA 
TCTTTACAQA 
ACCAAATCTO 
TAGATTTTGA 
6TTAGGATGA 
TCATCAGIAC 
6A6GTTTCCA 
ATACAAGTAC 
AAAAAAAAAA 
TTGATCTCAC 
TAATTTAG6G 
ATCCCACTGC 
TT6TATTTGT 
.CCTGCTCTAT 
TGAAATCATT 
ATQAAIGTTT 
TAAATATTGT 



11 

I 

TGCTCTGAQA 
QATCATCTTA 
GGKIGGAATT 
AGCAOGGTCT 
AGGCGGCCAT 
CTGT6CTQCT 
CAACTGAT6A 
AAGATGOGAT 
TCX3UUU30QA 
CTACrOGCAC 
CCTTGAAGAT 
TQTGCATGGC 
A6GAAATGTC 
AATCAAATAT 
TACTTCTACT 
AAGGATGATC 
TGTTATTATT 
AAAATTGSAA 
ATAGAAATAA 
GGTATATQTA 
GTACAGTTTT 
GATTATTCTA 
TAIGCATTAT 
TGCAATAAAT 



21 

I 

ATTTGTGAGC 
ATTTACTTAT 
TTTCATAACT 
GGCAAATACA 
CTCGCAACTT 
GGATGGATGG 
TTTGGAAAAA 
GCCTATTQCT 
ATTTTTAAAT 
ATTAGACTCA 
GACCCAGGTT 
TTT6TGGGAA 
ATGACCITGA 
GTTGCAATGG 
GGAAATAAAA 
AAAACACACA 
AACATTTATT 
AATATAGQAA 
CAAGOQTTAA 
TATAT6TACC 
GTATTATACT 
CAAAAACAT6 
TTAAGOCTGT 
ATCCTTCGGA 



31 

1. 

AGCCXCTAAC 
TTCTCTTGCT 
CCATATG6CT 
AGCTCACCTA 
ACAAGCAGCT 
CTAAGGGCAG 
CTGGCA7TAT 
ACAACCCACA 
CTCCAQ6CTT 
AGTATGGTCA 
GCTTGGCTGA 
GATACTGIG6 
AGTTTCTAAG 
ATCCTGTATC 
ACTTTTTAGC 
GTGTTTATGT 
TATTATTTTT 
ACTTTAAAOG 
CATTTTCATA 
TATATGTATT 
TTTTAAATCT 
ATTTTAAACA 
CTCIATT6TT 
ATTC 



41 

I 

AGGCTGTTAC 
ATGGGAAGAC 
TGAAOGAGCA 
OGCAQAAGCT 
AGAGGCA6CC 
AGTT6GATAC 
TGATTATGGA 
GQCAAAGGAG 
GOCftAATGAG 
GGGTATTCAC 
TTAT6TTGAA 
AGATGA6CTT 
TGATGCTTCA 
CAAATCCA6T 
TGGAA6ATTT 
TGGAATCTTT 
CTAAATOTGA 
AOAAAATGAA 

■ mvm ' LTT 

TGCATTTGAA 
TQAACTTTAT 
GCTGTAAAAT 
OGAAITTCAG 



51 

I 

TTCACTACAA 
ACTCAAGGAT 
GCCGGTGTGT 
AAS60G0TGT 
AGAAAAATTG 
CCCATTQTGA 
ATCOGTCTCA 
TGTGGTOGG6 
TROSA AGATA 
CTGAQTTTTT 
ATATATGACA 
CCAQATGACA 
GTGACA6CTG 
CAAGGAAAAA 
AGCCACTTAT 
TGGAACTCCT 
AAGAAATACA 
AGCTCTCATA 
TCAffTCATTT 
ATTTTGGAAT 
GAACATTTTC 
ATTCTATGAT 
GTCATTTTQl 



Seq ID NO: 507 Protein sequence
Protein Accession #: NP_009046.1

1          11         21         31         41         51

|          |          |          |          |          |

MIILIYLFLL LWEDTQGWGF KDOIFHNSIH LBRAAGVYHR 6ARSGKYKLT YABAKAVCBF 
EGGHLATYKQ LEAARKIGPH VCAA6W^4AKG RVGYPIVKPG PNXXFGKTGI ZDYGIRLNRS 
ERWDAYCYNP HAKECGGVPT DPKRIPKSPG FFNEYEDNQX CYWHIRLKYG QRIKLSFLDF 
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GOELPDDIIS TGNVMTLKFL SDASVTAGGP 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRFSHL 






Seq ID NO: 508 DNA sequence

Nucleic Acid Accession #: NM_001044.1

Coding sequence: 129..1991



1 

I 

ACCGCTCCGG 
AAAGCCCAGG 
6T6TGCCCAT 
CTAAGGAGGC 
ACGGAGTOCA 
AGGATCGGGA 
TGGACCTGGC 



11 

! 

AGOGGGAGGG 
CC0GGGCG6C 
GAGrAAOAGC 
CAATGCCGTG 
GCTCACCAGC 
GACCTGGGGC 
CAA0GTCT66 



21 

1 

GAGGCTTOGC 
CASACCAAGA 
AAATOCTCOG 
GGCC06AAG0 
TCCACCCTCA 
AA6AAGATCG 
COGTTCCCCT 



31 
I 

GGAACGCTCT 
GGGAA GAAG C 
TGG6ACTCAT 
AGGTGGAQCT 
CCAACCC60G 
ACTTTCTCCT 
ACCTOTGCTA 



41 

I 

CGGCGCCAGG 
ACA6AATTCC 
GTCTTCOGTG 
CATCCTTGTC 
GCAGAGCCCC 
GTCOGTCATT 
CAAAAATGGT 



51 

I 

ACTCGCGTQC 
TCAACTCCCA 
GTGGOCCC60 
AAGGAGCAGA 
GTGGAGGCCC 
GGCTTTGCTG 
GGOGGTGCCT 




TCCTSQTCCC CTACCTGCTC TTCATGGTCA TTGCTGGGAT GCCACTTTTC TACATCGAGC 480 

TG6CCCTCGG CCAGTTCAAC AGGGAAGGGG COGCTGGTGT CTGGAAGATC TGCCCCATAC 540 

TQ2VAAGGTC3T GGGCTTCAOG GTCATCCTC3V TCTCACTGTA TGTCGGCTTC TTCTACAACG 600 

TCATCATCJQC CTGQGCGCTG CACTATCTCT TCTCCTCCTT CACCACGGAG CTCCCCTGGA 660 

TCCACTGCAA CAACTCCTGG AACA6CCCCA ACTGCTCGGA TGCCCATCCT GGTGACTCCA 720 

6TQGAQACA6 CTCGGQCCTC AAOSACACTT TTGGGACCAC ACCTGCTGCC 6AGTACTTTG 780 

AAOGTOGCGT OCTQCACCrC CACCAGAGCC ATOGCATCGA OGAOCTGGGG CCTCCGCGGT 840 

Q6CAGCTCAC AGCCTOCCTG GTGCTGOTCA TOGTGCTGCT CTACTTCAGC CTCTGGAAQG 900 

6CGTGAAGAC CTCAGGGAAG GTGGTATGGA TCACAGCCAC CATGCCATAC 6TGGTCCTCA 960 

CTGCCCTGCT CCTGCGTGGQ GTCACCCTCC CTGGAGCCAT AGACGGCATC AGAGCATACC 1020 

TGA6CGTTGA CTTCTACOGG CTCT6CX3AGG CGTCTGTTT6 GATTGA0G06 GCCAGCCAG6 1080 

TGTGCTTCTC CCTQGGCGTO GGOTTCGGGG TGCTQATG6C CTTCTCCA6C ThCAACAAGT 1140 

TCACX»ACAA CT6CTACAGG GAOQOGATTG TCACCACCTC C31TCAACTCC CTGACGAOCT 1200 

TCTCCTCCQG CTTCX5TCX3TC TTCTCCTTCC TGGGGTACAT GGCACAGAAG CACAGTGTGC 1260 

CCATCGGGGA CGTGGCCAAG GACQGGCCAG GGCTGATCTT CATCATCTAC CCGGAAGCCA 1320 

TCGCCAC3GCT CCCTCTGTCC TCAGCCTGGG CCGT6GTCTT CTTCS^TCATG CTGCTCACCC 1380 

TGGGTATCGA CAGCGCCATG GGTGGTATQG AGTCAGTQAT CRCOQQGCrC ATOGATGAGT 1440 

TCCAGCTQCT GCACAGACAC 0GTGA6CTCT TCACGCTCTT CATOGTOCTG GCQAOCTTCC 1500

TCCTOTCCCT GTTCTGOGTC ACCAACGGTG GCATCTACGT CTTCACGCTC CTGGACCATT 1560 

TTGCAOCCGG CACGTCCATC CTCTTTGGAG TGCTCATCX3A AGCCATCGGA GTGGCCTGGT 1620 

TCTAT6GTGT TGGGCAGTTC AGCGACGACA TCCAGCAQAT GACCGGGCAO CSGGCCCAGCC 1680 

TGTACTGGOG GCTGTGCTGG AAGCTGGTCA GCCCCTGCTT TCTCCTSTTC GTGOTCQTGO 1740 

TCAGCaTTGT 6ACCTTCAGA CCCCCCCACT ACG6AG0CTA CATCTTCCCC OACTGGGCCA 1800 

AOGCGCTGGO CTGGGTCATC GCCACATCCT CX^ITGGCCAT GGTGCCCATC TATGOGGCCT 1860 

ACAACTTCTG CAGCCTGCCT GGGTCCTTTC GAGAGAAACT GGCCTACX3CC ATTGCACCCG 1920 

AGAAGGACCG TGAGCTGGTG GACAQAGGGG AGGTGCX3CCA GTTCACGCTC CGCCACTGGC 1980 

TCaAGGTGTA GAOGGAGCAG AGACGAAQAC CCCAGGAAGT CATCCTGCAA TGGGAGAGAC 2040 

ACX3AACAAAC CAAGGAAATC TAAGTTTC6A GAGAAAGGAG GGCAACTTCT ACTCTTCAAC 2100

CTCTACTGAA AACACAAACA ACAAAGCAGA AGACTCCTCT CTTCTGACTG TTTACACCTT 2160 

TocorracooG OAGcacACcr cxscogtqtct tgtgttgctq taataacgac gtagatctgt 2220 

6CA6CGAGGT CCACCCCGTT GTTGTCCCTG CAGGGCAGAA AAAOGTCTAA CTTCATGCTG 2280 

TCTGTGTGAG GCTCCCTCOC TCCCTGCTCC CTGCTCCCGG CTCTGAGGCT GCCCCAGGGG 2340 

CACTGTGTTC TCAGGCGGGG ATCACGATCC TTGTAGACGC ACCTGCTGAG AATCCCOGTG 2400 

CTCACAGTAG CTTCCTAGAC CATTTACTTT GCCCATATTA AAAAGCCAA6 TGTCCTGCTT 2460 

GGTTTAGCTG TOCAGAAGGT GAAATGGAGG AAACCACAAA TTCATQCAAA GTCCTTTCCC 2520 

GATGCGTGGC TCCCAGCAGA GGCOGTAAAT TGAGCGTTCA 6TTGACACAT TGCACACACA 2580 

GTCTGTTCAG AGGCATTGGA GGATGGGG6T CCTGGTATQT CTCACCAGGA AATTCTGTTT 2640 

ATGTTCTTGC AGCAGAGAGA AATAAAACTC CTTQAAACCA GCTCAQGCTA CTGCCACTCA 2700 

GGCAGCCTGT GGGTCXTTTGr GGXGTAGGGA AOGGCCTGAO AGGAGCGTGT CCTATCCCCG 2760 

QAOSCATGCA GG6C0CCCAC AGQAGGSTGT CCTATCCCCG OAGGCATGCA GOGCOCCCAC 2820

AGGAGCATGT CCTATCCCTO GACGCATGCA GG6CCO0CAC AGGAGOGTGT ACTACCCCAG 2880 

AACGCATGCA GGGCCCCCAC AGGAGOGTGT ACTACCCCAG GACGCATGCA G6GCCCCCAC 2940 

TGGAGCGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC AGGAGCGTGT CCTATCCCCG 3000 

GACOGGACGC ATQCAGGGCC CCCACAGGAG CGTGTACTAC COCAGGACGC ATGCAGG6CC 3060 

CCCACAGGAG CGTGTACTAC CCCAGGATGC AT6CRQG0CC CCCACAGGAO OSTOTACTAC 3120 

CCCAGGACGC ATGCAGGGCC CCCATGCAGG CAGCCTGCA6 ACCAACACTC TQCCTGGCCT 3180 

TGAGCCGTGA CCTCCAGGAA GGGACCCCAC TGGAATTTTA TTTCTCTCAG GTGCGTGCCA 3240 

CAXCAA.TAAC AACRGTTTTT ATGTTTGCGA ATGGCTTTTT AAAATCATAT TTACCTGTGA 3300 

ATCAAAACAA ATTCAAGAAT GCAGTATCCG CQAGCCTGCT TGCTGATATT GCAGTTTTTG 3360 

TTTACAAGAA TAATTAGCAA TACTGAGTGA AGGATGTTGG CCAAAAGCTG CTTTCCATGG 3420 

CACACTQCCC TCTGCCACTG ACAGGAAAQT GGATGCCATA GTTTGAATTC ATGCCTCAAG 3480

TCGGTGQGCC TGCCTACGTG CTGCCCGAGG GCAGGGGCCG TGCAGGGCCA GTCATGGCTG 3540 

TCCCCTQCAA GT0QACQTG6 GCTCCAGGGA CTGGAGTGTA ATGCTCGGTG GGAGCCGTCA 3600 

OCCTOTGAAC TGCCAGGCAG CTGCAGTTAG CACAGAGGAT GGCTTCCCCA TTGCCTTCTO 3660 

GGGAGGQACA CAGAGGACGG CTTCCCCATC GCCTTCTGGC CGCTGCAGTC AGCAGAGAGA 3720 

GCGGCTTCCC CATTGCCTTC TGGGQAGGGA CACAGAGGAC AGTTTCCCCA TOGCCTTCTG 3780 

GTTGTTQAAG ACAGCACAGA GAGCGGCTTC CCCATGGCCT TCTGGGGAGG GGCTCCGTGT 3840 

AGCAACCCAG 6TOTTGTCGG TGTCTGTTGA GCAATCTCTA TTCAGCATCQ TQTGGGTCCC 3900 
TAAGGACAAT AAAAGACATC CACAATG6AA AAAAAAAAAG QAATTC

Seq ID NO: 509 Protein sequence 
Protein Accession #: NP_001035.1

1 11 21 31 41 51

| | | | | |

MSKSKCSVGL MSSWAPAKE PNAVGPKEVE LILVKEQNGV QLTSSTLTMP RQSPVEAQDR 60 

ETWGKKIDFL LSVIQPAVDL ANVWRPPYLC YKNGGGAFLV PYLLFMVIAG MPLPYMELAL 120 

GQPNREGAAG WfKICPILKG VGFTVILISL yVGFFWJVII AHAIiHYItFSS PTTEUWIHC 180

NNSWNSPNCS DAHPGDSSGD SSGLNDTPGT TPAABYFERG VLHLHQSHGI DDLGPFRWQL 240 

TACLVLVIVL LYPSLWKGVK TSGKWWITA TMPYWLTAL LLRGVTIjPGA IDGIRAYLSV 300 

DFYRLCEASV WIDAATQVCF SLGVGPGVLI AFSSYNKFTN NCYRDAIVTT SINSLTSFSS 360 

OPWPSFLGY MAQKHSVPIG DVAKDQPGLI FIIYPBAIAT LPIiSSAWAW FPIMI/LTLOI 420 

DSAMGGMESV ITGLIDBFQL LHRHRELPTL FIVLATFLLS LPCVTNGGIY VFTItLDRFAA 480 

GTSILFGVLI EAIGVAWFYG VGQFSDDIQQ MTGQRPSLYW RLCWKLVSPC FLLFVWVSI 540 

VTFRPFHYGA YIFPDWANAL GHVIATSSMA MVPIYAAYKF CSLFGSFREK LAYAIAFSKD 600 
RBLVDROEVR QFTLRHWLJCV 

Seq ID NO: 510 DNA sequence 

Nucleic Acid Accession #: NM_001216.1

Coding sequence: 43..1422

1 11 21 31 41 51

| | | | | |

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60

A6CCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCA6 GCCTCACTGT GCAACTGCTG 120 

CTGTCACTOC T6CTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCOGGAT GCAGGAGGAT 180 

TCCCCCTTGO GAGGAGGCTC TTCTGOGGAA 6ATGACCCAC TCGGCGAOGA GGATCTQCCC 240 
AGTGAAOAGG ATTCACCCAG AGAGC3AGGAT CCACCC3GGAG AGGAGGATCT ACCTGQAGAG 300

GAGGA.TCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360

TCOCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420

AATAATGCCC ACAGGGACAA AGAAGGGOAT GACCAGAGTC ATTGGGGCTA TG6AGGCX3AC 480

CCGCCCTGGC CCCX3GGTGTC CCCAGCCTGC 60GGGC0GCT TCCAGTGCCC GGTGGATATC 540

CGCCCCCAGC TCGCCGCCTT CTGCCOGGCC CTGC3GCCCCC TGGAACTCCT G6GCTT0CAG 600

CTCCCX3C06C TCCXSVOAACT GCJGCCTGCXSC AACAATGGCC ACAGTGTGCA ACTGACCCTO 660

CCTCCTGGGC TAQAGATGGC TCTGGGTCX:C GGGCGGGAQT ACCGGGCTCT GCAGCTGCAT 720

CTGC»CTGGG GG6CTGC3U3G TCGTCOGGGC TCX3GAGCACA CTGTGGAAGQ CCACCGTTTC 780

CCTGCCGAOA TCCACOTGGT TCACCTCAGC AOOGGCTTTG CCAQAGTTGA CSGAGGCCTTO 840

GOGOGCCOQG GAGGCCTGGC OGrGTTGGOC 60CTTTCTGG AGQAGGGCrC GGAAGAAAAC 900

AGTGCCTATG AGCAGTTGCT GTCTCGCTTQ QAAGAAATCG CTQAGGAAGG CTCAGAGACT 960

C3VGGTCCCA0 GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020

TATGAGGGGT CTCTGACTAC ACOGCXJCTGT GCCCAOGGTG TCATCTGGAC TGTGTTTAAC 1080

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACAOCCTCT CTGAO^CCCT GTGGGQACCT 1140

GGTGACTCTC GGCTACAGCT GAACTTCOSA GCQAOGCAGC CTTT6AATGG GOGAGTQATT 1200

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTOSGG CTGCaX3AC3CC AGTCCAGCTQ 1260

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320

ACCAGCGTOG CGTTCCTTGT GCAOATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380

GTGAGCTACX: GCCOGCAGA 6GTA6CC9SAG ACTGGA6CCT AGAG6CT6GA TCITCXSAGAA 1440

TG7GAGAAGC CAGCCAGAGG CATCT6AGG6 GGAGCCGGTA ACTGTCCTGT CCTGCTCATT 1500
ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT

Seq ID NO: 511 Protein sequence
Protein Accession #: NP_001207.1

1 11 21 31 41 51

| | | | | |

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60

GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120

DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180

ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 240

VEGHRFPAEI HVVHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300

EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 360

DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420
GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA



Seq ID NO: 512 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..3978



1 

1 

ATGGTGGGTG 
TnCCAGAAA 
TTAGCACCCA 
ACGC06GTGA 
TCGACATATG 
GTAGCAAGGG 
ACACGG8T8T 
CCGACA6TTC 
GTTGGCATTG 
GCCCTTGCCT 
TTGGTTTTTG 
CTCAATATAC 
CCAGCCACCA 
CCCACA6CTC 
OCCAAGCTCA 
ACa^ATGAATG 
TTTACCAACA 
TTTGTCCAAA 
ACATTATCCT 
ATTGCCATGT 
ATGGCTGAAG 
CCATCTTACA 
TTGACATGGG 
AGGCATTTAT 
GGAGCCACTG 
TTTGTGGTGA 
GCAGTGTTTG 
AAGGATGAAT 
6CCAAATACC 
CTCCTT6CAG 
ACTTTGOCCT 
CTCTTTGGAG 
CAGAAGGACC 
AACCTCTCTG 
CAGCTCTACC 
' TTT6A66AGT 
CTACAGTTCT 
AAGGGAACCC 
CTGOQAQGAT 
TTCAAGGAGA 
CTCTTCACTG 
CTQGGTCTCT 
ATGTGTQAGG 
ACTGCAAGCA 



11 

I 

AAGGACCCTA 
GATAIGAOCC 
ACGOGGTGGA 
TGGTGAAAQG 
ACTCATCTGA 
TGGGTCCTGA 
TGATGGACAT 
TCATTCACCA 
GACTGTGCAT 
GGGCCATCAA 
AAAACCTAGT 
TGTCAAGTGA 
TCCOGATCCT 
TCATOGGQAT 
ATTCAGCTTT 
AGTTTCTGAC 
CTATCCAAGA 
GTGGAAACTC 
GCCACATCCT 
TTAATGTAAT 
CGAATGTCTC 
TCACCXIAACC 
AGCATGAAGC 
GCAAGAAACA 
GCCCAGAGGA 
GAAAGTTATG 
TTGGGAGAAT 
CTAGAAGGC7 
TGGGGAAOAT 
CTCTCCTAG6 
ACGTTTCACA 
AAAAGTATGA 
TGAGCAACCT 
GGGGGCAGAG 
TGCIGGAOGA 
6CATTAAGAA 
TAGAQTCTTG 
ACAAGGAGTT 
TGCAQTTCAA 
GC^CTGCTGA 
TGTTCCTCTT 
GGTTGGACAA 
TCGGCGCGGT 
TGGTGTTCAT 



21 

1 

CCTTATCTCA 
CAGCCT6AA0 
TGATGCOGGG 
CTACCGGCAA 
CACCAATGCC 
GAAGGCCTCT 
CGTGGCCAAC 
AATCCTCCAG 
AGCCCTTTTT 
CTACCGCACG 
GTCCTTCAAG 
TAGCTATTCT 
AATGGTCTTT 
ATCAGTGTAT 
CCGAAGGTCA 
CTGCATCAGG 
TATAAGAAGG 
TGCCCTGGCC 
CCTGAGAGGC 
GAAGTTTTCC 
TCTAAGGAGA 
AGAAGACCCA 
CAGCAGGAAA 
GAGGTCAGAG 
GCAAAGTGAC 
TCGTTATCXJC 
CATCAGAGGA 
TCTTACTTGG 
CTTGGGAATA 
ACAGATGCAO 
GCAGGCATGG 
TCACCAAAGO 
COCCTATGGA 
GCA6AGGATT 
CGCCCTGTOG 
6AC6CTCAGG 
TGATGAAGTT 
AATGGAGGAG 
GQATCCT6AA 
GAGAGAGGAA 
CCTCCTGATG 
GGGCTCACGG 
GCTGGCAGAC 
GCTGGTGTTT 



31 
1 

GATCTGGACC 
AOCATGATCC 
CTACTCTCCT 
AGGCTGACOG 
AAAAGATTTC 
CTGAGCCACG 
ATCCTOTGCA 
CAGACTGAGA 
GCCAC06AGT 
GCCATCOGGT 
ACATTGACCC 
TTGTTTGAAG 
TGTGCGGCGT 
GTCATATTCA 
6CAATTTTGG 
CTGATCAAAA 
AGGGAAAGAA 
CCCATCGTGT 
AAACTCACC6 
ATTGCAATCT 
ATGAAOAAAA 
GATACTGTCT 
AGTACCCCAA 
GCATACAGTG 
AGCCTCAAAT 
GAAGCCCAGC 
TACAGGCCTC 
CCCCAAGAAG 
TGTG6QAATG 
CT6C3U3AAAG 
ATCTTTCAT6 
TATCAGCACA 
GACCTGACTG 
AGCCTGGCCC 
GCGGT0GA08 
GGAAA6ACAG 
ATTTTATTAG 
AGAGGGCGCT 
CACCTTTACA 
GATGCTOOTA 
ATTGQCAGCG 
ATGACCTGTG 
ATOGGTCAGC 
GGGGTCACCA 



41 
1 

AGCGAGGCOG 
CA6TGCGACC 
TOGCCACATT 
TAGACACCCT 
GAGTCCTTTG 
TGGT6TGGAA 
TCATCATGGC 
GGACCTCTGG 
TTACCAAAGT 
TGAAGGTGGC 
ACATCTCTGT 
CTGCCTTGTT 
ACXaCCTTTTT 
TAOCOGTCCA 
T8ACAGACAA 
TGTAT6CCT6 
AATTACTGGA 
CCACCATAGC 
CACCOGTGGC 
TGCCCTTCTC 
TTCTCATAQA 
TGCTTTTAGC 
AGAAATTGCA 
AGAGGAGTCC 
C98QTTCT6CA 
TCCTGGCTTG 
ATQQATTTTC 
TGGATAGGAC 
TGGGAAGTGG 
GGGTG6TGGC 
GAAATGTGAG 
CAGTCCGC3GT 
AGATTGGGGA 
GCGCTGTCTA 
COCACQTOGO 
T06TCCTGGT 
AAOATGGAGA 
ATGCAAAACT 
ATGCAGCAAT 
TAATOGOGXA 
CTQCCTTCAa 
GGCCCCAGGG 
ATGTGTACCA 
AAGGCTTCGT 




300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



51 
I 

GOGGAGATCC 
CTGTGCAAGG 
TTCCTGGCTC 
QCCCCCATTG 
GGATGAAGAG 
ATTCCAGAGG 
AGCCATAGGG 
GAAAGTCTGG 
CTTCTTTTGG 
GCTCTOCAOC 
TGGOSAGGTG 
TTGTCCTTTG 
CATTCTGGGG 
GATOTTTATG 
GCX3AGTTCAG 
GGAGAAATCT 
AAAAGCTGGA 
CATCGTGCTG 
ATTTAGTGTG 
CATCAAAGCA 
TAAAAGCCCC 
AAATGCCACC 
GAACCAGAAA 
ACCA6CCAAG 
CA6CATAA6C 
GAGGIGGCCA 
TGCTAAAGAC 
TCAAAGGGCA 
AAAGAGCTCC 
AGTCAATGGA 
A6AAAACATA 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280
2340 
2400 
2460 
2520 
2580 
2640 
ACCACACTGA TGGCATCCTC CTCTCTGCAT QACACGGTGT TT6ATAAGAT CTTAAAGAGC 2700

CCAATGA6TT TCTTT6ACAC GACTCCCACT G6CAGGCTAA TGAACCGTTT TTCCAAOGAT 2760 

ATQQACGAGC TGGATGTGAO GCTOCCQTTT CACGCA6AGA ACTTTCTGCA 6CAGTTTTTT 2820

ATGGTGGTGT TTATTCTOGT GATCTTGaCT GCTGTGTTTC CTGCTGTCCT TTTAGTCGT6 2880

GCCAGCCTTG CTGTAGGCTT CTTCATTCTG TTACGCATTT TCCACAGAGG AGTCCAGQAG 2940 

CTCAAGAA6G T6GAGAATGT CAGCOGGTCA CCCTGGTTCA CCCACATCAC CTCCTCCATG 3000 

CAGGOCCTGG 6CATCATTCA 0GCCTATG6C AAGAAGGAGA GCT6CATCAC CTATACTTCA 3060 

TCCAAAGGCC TGTCATTOTC ATACATCATC CAQCTGAGOG QACT6CT0CA AGTGTGTGTQ 3120 

CQAACGGGAA CAGAGACGCA AGCCAAATTC ACCTCC6TGG AGCTGCTCAG GGAATACATT 3180 

TCGACCTGTG TTCCTGAATG CACTCATCCC CTCAAAGTGG GGACCTGTCC CAAGGACTGG 3240 

CCCAGCTGTG GGGAGATCAC CTTCAGAGAC TATCAGATOA GATACAGAGA CAACACCCCC 3300 

CTT6TTCTCG ACAGCCTGAA CTTGAACATA CAAAGTGGGC AGACAGT0G6 GATTGTTGGA 3360 

AGAACAGGTT CCGGAAA6TC ATC6TTAGGA ATGGCTTTGT TTCGTCTGGT GGAGGCASCC 3420 

AGTGGCACAA TCTTTATTGA TGAGGTGQAT ATCTGCATTC TCAGCTTG6A AGACCTCAGA 3480 

ACCAAGCTGA CTGTGATCCC ACAGGATCCT GTCCTGTTTG TAGGTACAGT AAGGTACAAC 3540 

TTGGATCCCT TTGAGAGTCA CACCGATQAG ATGCTCTGGC AGGTTCTGGA GAGAACATTC 3600 

ATGAGAGACA CAATAATGAA ACTCCCAGAA AAATTACAGG CAGAAGTCAC AGAAAATG6A 3660 

QAAAACTTCT CAOTAGGGGA AC6TCAGCTG CTTTGTGTGG CCOGAGCTCT TCTCOGTAAT 3720 

TCAAAGATCA TTCTCCTTGA TOAAGCCACC GCCTCTATGa ACTCCAAGAC TGACACCCTG 3780 

OTTCAGAACA CCATCAAAQA TGCXTTTCAAG GGCTGCACTG TGCTGACCAT OGCCCACCGC 3840 

CTC31ACACAG TTCTCAACTG CGATCAC6TC CTGGTTATGG AAAATGGGAA GGTGATTGAG 3900 

TTTGACAAGG CTGAAGTCCT TGCAGAGAAG CCA6ATTCTG CATTTGCXSAT GTTACTAGCA 3960 
GGAOAAGTCA GATTGTAG 

Seq ID NO: 513 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MVGB9PyLI8 DIiDQSGRRRS FABRYDPSLK IWIPVRPCAR ZiAPNFVIlDAG LtiSFATPSWL 60 

TPVMVKGYRQ RLTVDTLFPL STYDSSDTNA KRFRVLWDEE VARVGPBKAS LSBWWKFQR 120 

TRVLMDIVAN ILCIIMAAIG PTVLIHQILQ QTERTSGKVW VGIGLCIALP ATBPTKVPFW 180 

ALAWAINYRT AIRLKVALST LVFENLVSFK TLTHISVGEV LNILSSDSYS LFEAALFCPL 240 

PATZPILMVF CAAYAFFILG PTALIGISVY VIFIFVQMFM AKliNSAFRRS AILVTDRRVQ 300 

TVQIEFIiTCIR LIKMITAWEKS F7NTIQDZRR RBRKIiLRXAG FVQSGKSALA PIVSTZAIVL 360 

TLSCHZXiIjRR KLTAPVAFSV IAMFNVMKFS IAIIiPPSIKA KABANVSLRR MKKILIDKSP 420 

PSYITQPEDP DTVLLUVMAT LTWEHEASRK STPXKLQNQK RHLCKKQRSE AYSBRSPPAK 480 

GATGPEBQSD SLKSVLHSIS FWRKLCRYP EAQLLAWRWP AVFVGRIIRG YRPHGFSAKD 540 

XDBSRRItLTW PQEVDRTQRA AKYLGKILGI GGNVGSQKSS LLAALLGQMQ LQKGWAVNG 600

TIAYVSQOAH IFHGHVRENZ LFGEKSDHOR y(^TVRVC9GL QKDLSNLPYG DLTEZ6BR6L 660 

NLSGGORORI SLARAVYSDR QLYLLDDPbS AVDAHVOXRV FEECIKKTLR 6KTWLVTBQ 720 

LQPLESCDEV ILLEDGBICB KGTHKEU4EE RGRYAKLIHN LRGLQFKDPE HLYNAAKVBA 780 

FKESPAEREE DAGIIGYLLS LFTVFLFLLM IGSAAFSNVIH LGLHLDKGSR MT06PQGNRT 840 

MCEVQAVLAD IGQHVYQWVY TASMVFMLVF GVTKGFVFTK TTLMASSSLH DTVFDKILKS 900 

PMSFFDTTPT 6RLMNRFSKD MDELDVRLPF HAENFIjQQPF MWFILVILA AVFPAVLLW 960 

ASLAVGFFIL LRIFHRGVQE LKKVENVSRS PWFTEITSSM Q6LGIIKAYG RKB8CITYTS 1020 

SKGLSLSYII QLSGLLQVCV RTGTETQAKF TSVELLREYI STCVPECTHP LKVGTCPKDW 1080 

PSCGEITFRD YQMRYRDNTP LVLDSLNUII QSGQTVGIVG RTGSGKSSLG NALFRLVEPA 1140 

SGTIFIDEVD ICXLSLEDLR TKLTVIPQDP VLFVGTVRYN LDPFBSBTDG MLWQVIiERTF 1200 

MRDTIMKLPB KLQAEVTEN6 ENPSVOBRQL LCVARALURN SKIILLDEAT ASMDSKIDTL 1260 

VQNTIKDAFR GCTVLTIAHR UlTVLNCDHV LVKENGRVIE FDKPEVLABK PDSAFAMUA 1320 
AEVRL 

Seq ID NO: 514 DNA sequence
Nucleic Acid Accession #: Z31560
Coding sequence: 1-966

1 11 21 31 41 51

| | | | | |

CACAGOSCCC GCATGTACAA CATGATGGAG ACGGAGCTGA AGCCGCCQGG CCCGCAGCAA 60 

ACTTCGGGGG GCGGCGGCGG CAACTCCACC GOGGOGGOSG CCGGCGOCAA CCAQAAAAAC 120 

AGCCCGGACC GCGTCAAGCG GCXCATGAAT GCCTTCATGG TQTGGTCCCG CGGGCAGCGG 180

CQCAAGATG9 CCCAGQAGAA CCCCAAGAOXS CACAACTCG6 AC3ATCAGCAA 6G6CCIGG0C 240 

GCCGAGTGGA AACTTTTGTC GOAGAGGGAG AA6CX3GCGGT TCATCXAC6A GGCTAAGCGG 300 

CTGCX3AGCGC T6CACATGAA GGAGCACCCXS GATTATAAAT ACCGGCCXZCG GCGGAAAACC 360 

AAGACGCTCA TCAAGAAGGA TAAGTACAOG CTGCCCGGCG GGCTGCTGGC CCCCGGCGGC 420 

AATA6CAT66 CGAGG6GGOT CGGGGTGGGC GCCGGCCT6G G0GGGGGC6T GAACCAGCX3C 480 

ATGGACAGTT AGQOQCACAT GAA0GGCTG6 AGCAACG6CA GCTACAOCAT GATGCAGGAC 540 

CAGCTGGGCT ACCCQCAGCA CCCGGGCCTC AATGC6CACG GCGCA8CGCA GATGCAGCCC 600 

ATQCACCGCT ACGACGTGAG CGCCCTGCAG TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAACX3GCT OGCCCACCTA CAGCATGTCC TACTOGCAGC AGGGCACCCC T6GCATGGCT 720 

CTTGGCTCCA TGGGTTCGGT GGTCAA6TCC GAGGCCAGCT CCAGCCCCCC TGTGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCC3GGGACAT GATCAGCATG 840 

TATCTCCCCG GCGCXX3AGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCAGA GOGGCCCGGT GCCCGGCAOG 6CCATTAA0G GCACACTGCC CCTCTCACAC 960 

ATOTQAGGQC 0GGACA6C6A ACTGGAGGGG 6GAGAAATTT TCAAA6AAAA ACGAGGGAAA 1020 

TGGGAOGGGT GCAAAAGAGG AOAGTAAGAA ACAGCATGGA GAAAACCCGG TAGGCTCAAA 1080 
AAAAA 

Seq ID NO: 515 Protein sequence
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

HSARMYNMME TELKPPGPQQ T86GGG6NST AAAAGGNQKN SPDRVKRPMN AFHVWSRGQR 60 

RRMAQEE3PKM HKSBZSKRLG AEWKLI^BTE KRPFIDEAKR LRALHHKBHP DYKYRPRRKT 120 

KTLMKKDKirr LPQGLIiAPGG NSHASGVGVO AGIiGAGVNQR MDSYAHMHGK SNGSYSMHQD 180 
QLOyPQHPGL KAHGAAQMQP MHRTOVSALQ YNSMTSSQTY MNG5PTYSMS YSQQGTPGHA 240

LOSMOSWKS SASSSPFWT SSSHSRAPGQ ASDLRDMISM YLFGAEVFEP AAFSRI1BH8Q 300 
HYQS6PVPGT AIKGTLPLSH M 

Seq ID NO: 516 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29..541

1 11 21 31 41 51

| | | | | |

CGGACTTGGC TTGTTAGAAG GCTOAAAGAT QATGGCAGGA ATGAAAATCC AGCTTGTATG 60

CAMCTACTC CTG fl CTTTCA GCTCCTCGAG TCTGTGCTOl GATTCAGAAG AGGAAATGAA 120 

AGCATTAGAA GCAOATTTCT TOftCCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCtCTTGQ AAQATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCCC 240 

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCT6 360 

TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGQAA GATATTCTT6 ATACTGGAAA 420 

7GACAAAAAT GGAAA6GAAG AA6TCATAAA GAGAAAAATT CXTTTATATTC TGAAAC6GCA 480 

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGA6ATTCTT ACTATTACTG 540 

AGAGAATAAA TCATTTATTT ACATQTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CT CTTC TftCA ATTGTQOTTT 660 

ATTGAATOTG TTTTTCTOCSk CTAATAQAAA TTAGACXAAG TOTTTTCAAA lAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGGGOC 6CAATT 

Seq ID NO: 517 Protein sequence
Protein Accession #: AAB50564

1 11 21 31 41 51

| | | | | |

MMAGHKZQLV CMLLLAFSSH SLCSDSSEBM KALBADFL^EN MRTSKISKAB VPSHXMTLIiIf 60 

VCSLVMNLKS PAEETGEVHE EBLVASRKLP TALDGFSIiEA HIiTIYQLHKI GHSRAFQPIB 120 
LIQEDILDTG NDKN6KEEVI KRKIPYILKR QIiYENKFRRP YXLKSDSYYY 

Seq ID NO: 518 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109..2940

1 11 21 31 41 51

| | | | | |

A0CTAAAAOC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CAXATTGAAA ACCTSACACA 60 

ATGTATGCA6 CAG6CTCA6T 6TGAGTGAAC TGGA6GCTTC TCTACAACAT GACCXAAAGG 120 

AGCATTGCA6 GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCXTTGGGAGC TGGAC5TACAG CTTCAAGACA ATOGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTC3M3GT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAA6GAAATG 300 

ATAACTGAAO CTTCATTTTA CCTATTTAAT GCTACCAA6A QAAOAGTATT TTTGAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AQGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGQAAAA QAGGQAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TAOGGATCAC GAGGCCXSAGT GTTTGTCCAT 600 

GAATGG6CCC ACCTCCOTTG GGGXGT6TTC GATSU3TATA ACAATGACAA ACCTT TCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGC ATTTTT 720 

GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGT7U«5CT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TACCPiCCCAA AATGCAACTQ CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AQATGTGCAG CCTCAGAAST GCATGOQATG TAATCACAGA CTCT6CTGAC 960 

TTTCAiCCACA GCTTTCCCAT OAATGOQACT GAOCrTOCAC CTCCTCCCAC ATTCTCGCTT 1020 

GTACAGGCTG GT6ACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAQCC 6CAGAATTTT ATTTGATGCA GATTGTT6AA 1140 

ATTCATACCT TCX3TGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200 

CACCAAATTA ACASCAAT6A TQATCGAAAO TTGCTGGTTT CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAQACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320 

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTCCATTOCC 1440 

CTGG6TTCAT CTGCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAO 1500 

TTCTTTOTTC GAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAG6 TQAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTOTGGATA ATACTGTGG6 CftAOGACACT 1680 

ATGTTTCTA6 TTACXSTGGCA GGCXaVGTGGT CCTCCTGAGA TTATATTATT TGAT0CT6AT 1740 

GGACX3AAAAT ACTACACAAA TAATTTTATC ACCAATCTAA CTTTTOGGAC AGCTAGTCTT 1800 

TGGATTCCAG GARCAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACOCATCAT 1860 

TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTCGC3GCCT CX^ACTCAGC TGTGCCCCCA 1920 

GCCACTGTGG AAGCCTTTGT GGAAAGASAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAAT6 TGAAACAGGG ATTTTATOCC ATTCTTAATO CCACTGrCAC T6CCACAGTT 2040

GAGCCAGAGA CTGGAOATCC TGTTAC3GCT6 A6ACTCCTTG ATGATOGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCT6C AAATGGTAGA 2160 

TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAQC CCACTCTATT 2220 

CCAGGGAGTC ATOCTATGTA TGTACCAG6T TACACAGCAA AOGGTAATAT TCA6ATGAAT 2280 

6CTCCAAGGA AATCA6TAGG CAGAAATGAO GAGGM3GQAA AGTGGOGCTT TAGOCG AOTC 2340 

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA 6TTGCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAQAGGAATT GACCCTATCT 2460 

TGGACAGCAC CTGGAGAAGA CTTrGATCAG GGCCRGGCTA CAAGCTATGA AATAAGAATQ 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATO CTATTTTAGT AAATACATCA 2580 

AAGCX3AAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA CGTTCTCACC OCAGATTTCC 2640 

AOGAATGGAC CTGAACATCA GCCAAAT66A GAAACACATG AAAGCCACAG AATTTATGrT 2700 

GCAATAOGAG CAATGGATAG OAACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGCG 2760 

CCTCT6TTTA TTCX:CCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820 

GQAGTTTTAA CAGCAATGGG TTTGATA6GA ATCATTTGCC TTATTATAGT TSTGACACAT 2880 
CATACTTTAA GCAQGAAAAA GAGA6CA6AC AAiQAAAGAGA ATOGAACAAA ATTATTATAA 2940

ATAAATATCC AAAGTGTCTT CX:TTCTTA6A TATAAGACCX: ATGGCCTTOG ACTACAAAAA 3000 

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATQCATTQA6 TtTTTGTACA 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA AC3VATTCTTT TTGGGGGTAG ATTAGAAAAC 3120 

CCTTACACTT TC3GCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAAGGGAA GGGTAAAGTC GGACCAGT6T CAAGGAAAGT TT6TTTTATT 6AGGTGGAAA 3240 

AATAGCCCCA AiGCAGAGAAA AGGAGGGTAfi GTCTGCATTA TAACIGTCTG TGTGAAGC3kA 3300 

rCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360 

TTTACATGAA GATCATOCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420 

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AGA6GTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540 

TTTATQACAA AGGTCTATTG AATTTATTT6 TNTGTAAOTT TCTACTCCCA TCAAAGCAGC 3600 

TTTCTAAOTT TATTGCCTTG 6GTTAT7ATG GAATGATAGT TATAGCCCCM TATAAT60CT 3660 
TACCTAGGAA A 

Seq ID NO: 519 Protein sequence
Protein Accession #: NP_006527.1

1 11 21 31 41 51

| | | | | |

MTQRSIA6P1 CMLKPVTLLV ALSSEIiPPLG AGVQLQDNGY NGLLIAINPQ VPENQNLISM 60 

IKENITBASF YLFNATXSRV FFRNZKICiIP ATHKANMNSK IKQBSYEKAK VIVTDWYGAH 120 

(^PYTLQYR GCGKEGKYIH FTPNFLLNDN LTAGYGSRGR VFVHEHAHLR HGVFDBYMMD 180 

KPFYINGQNQ IKVTRCSSDI TGIFVCEKGP CPQBNCIISK LPKEGCTPIY NSTQNATASI 240 

MPMQSLSSW BFCNASTHNQ EAPMLQNQMC SLRSAWDVIT DSADPHHSFP MNGTBLPPPP 300 

TFSLVQAGDK WCLVLDVSS KMAEADRLLQ LQQAAEPYLM QIVEIHTFVG lASFDSKGEI 360 

RAQLHQINSN DDRKLI*VSYL PTTVSAKTDI SICSGLKKGF EWHKLNQKA yGSVMILVTS 420 

GDDKLLGNCL PTVLSSGSTI HSIALGSSAA PMX.EELSRLT G6LKFFVPDI SNSMSMIDAF 480 

8RISSGTGDI PQQHIQLBST GEKVKPHHQL KNTVTVDNTV GNDTHFLVTW QASGPPEIIL 540 

FDPDGRKYYT NNPITNXjTFR TASLWIPQTA KPCBEIMTYTm NTKHSLQALK VTVTSRASHS 600 

AVPPATVEAF VERDSLHFPH PVMIYANVKQ GFYPILMATV TATVEPBTGD PVTLRLLDDG 660 

AGADVIKNDG lYSRYFFSFA ANGRYSLKVH VNHSPSISTP AHSIPGSHAM YVPGYTANOJ 720 

IQMNAPRKSV GRNBEERKMG FSRVSSGGSP SVLGVPAGPH PDVFPPCKII DLEAVKVBEE 780 

LTLSWTAPGE DFDQGQATSY BIRMSKSLQN IQDDFNNAIL VNT8XRNPQQ AGIREIFTFS 840 

PQI8TN6PEH QPNGETHESH RZYVAISAMD RKSLQSAVSN lAQAPLFIPP NSDPVPARDY 900 
liILKGVIiTAM GLIGIICIiII WTHHTLSRK KRADKKEHGT KUt

Seq ID NO: 520 DNA sequence

Nucleic Acid Accession #: NM_000228.1

Coding sequence: 82..3600

1 11 21 31 41 51

| | | | | |

GCTTTCAGGC GATCTGGAGA AAGAACGQCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60 

G6ATC31CCOC ATTGGCTGAA GATGAGACCA TTGTTCCTCT TGTGTTTTGC CCTGCCTQGC 120 

CTCCTGCATG CCCAACAAOC CTQCTCCOQT GOGGCCTGCT ATCCACCT6T TGGGGACCTG 180 

CTTGTTGGQA GGACCCGGTT TCTCCQAGCT TCATCTACCT GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG 0GAGTGQCA6 ATQAAATGCT GCAAGTGT6A CTCCAGGCAG 300 

CCTCACAACr ACTACAGTCA C0QA6TAGAG AATGTGGCTT CATCCTCGGG CCCCATGCGC 360 

TGGTG6CAGT CCCAGAAT6A TCnOAACCCT GTCTCTCTGC AGCTGGACCT GGACAOGAfiA 420 

TTCCAGCTTC AAGAAGTCAT GATG6A6TTC CAGGGGCCCA T6CCCGCGGG CATGCTGATT 480 

GAGC6CTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC TGCCGACTGC 540 

ACCTCCACCT TCCCTCGGGT CCGCCAGGGT OSGCCTCAGA QCTGGCAGGA TGTTCGGTGC 600 

CAGTCCCTGC CTCA6AGGCC TAATGCACX3C CTAAATGGGG GGAAGGTCCA ACTTAACCTT 660 

ATGGATTTA6 TGTCTGGGAT TCXAGCAACT CAAAGTCAAA AAA3TCAAGA GGT6GG6GAG 720 

ATCACAAACT T6AGA6TCAA TTTCACCAGG CrGGCX:GCTG TGCCCCAAA6 GGGCTACCAC 780 

CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCCGTCTGC AOOGGAOCTG CTTCTGTCAC 840 

GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCXTTCTG CAGGCCCCTC CACCGCTGTG 900 

CAGGTCCACG ATGTCTQTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCX3CTGT 960 

GCACCXTTTCT ACftACAACOQ GCCCTGOAGA COQGGGGAGG GCCAGGAC6C CCATGAAT6C 1020 

CAAAGGTGOG ACTGCAATQG GCACTCAQAO ACATGTCaCT TTGACCCCS3C TGTGTTTGCC 1080 

GCCAGCCAGG GGGCATATGG AGGTGTGTGT GACSU^TTGCC GGGACCACAC CGAAGQCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC CG6AACCGGC GCCCGGQAGC TTCCATTCAG 1200 

OAGACCTGCA TCTCCTGCGA GTGTGATCCG GATGGGGCAQ TGCCAGGGGC TCCCTQTGAC 1260 

CCAQTGAOGG QGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320 

AAOCOGGGCT TCACTGGACT CACCTACGCC AACCOGCAGG GCTGCCACCG CTGTGACTGC 1380 

AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTG6GCG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAAT6TGAC CAGTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGTGCCTGC GACCCGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTQ GTGGCXnXSAT GTGCAGCGCT 1620 

GCAGCCATCC GCCAGTGTCC AGACCGGACC TATGGAGACG TGGCCACAGG ATGCCGAGCC 1680 

TGTGACTGT6 ATTTCOGGGG AACAGAGGGC CCX3GQCT6CG ACAAGGCATC AGGCCGCTGC 1740 

CTCTQCGGCC CTGGCTTQAC OOGGCCCGGC TGT6ACCABT GCCAGCOAGG CTACTGCAAT 1800 

CGCTACCXX3G TGTGCGTGGC CTGCXaCCCT TGCTTCCAGA CCTATGATGC GGACCT0CX3G 1860 

GAGCAGGCCC TGOGCTTTGG TAGACTCCX5C AATGCCACCG CCAGCCTGTG GTCAGQQCCT 1920 

GGGCTGGAGG ACGGTGGCCT GGCCTCCCGG ATCCTAGATG CAAAGAGTAA 6ATTGAGCAG 1980 

ATCCGAGCAG TTCTCAGCAG CCCC6CA6TC ACAGAGCAGG AGGTGGCTCA GGT6GCCAGT 2040 

GCCATCCTCT OOCTCAGQGG AACTCTCCAO GGCCTGCAGG TG(SVrCTGOC CCTG6A6GA0 2100 

GA6ACX3TTGT CCCTTCC6AG AGACCTGGAG AGTCTTGACA GAAGCTTCAA TG6TCTCCTT 2160 

ACTATGTATC AGAG6AAGAG GGAGCAGTTT GAAAAAATAA GCAGTX3CTGA TCCTTCAGGA 2220 

GCCTTCCX5GA TGCTGAGCAC AGCCTACGAG CAGTCAGCCC AGGCTGCTCA GCAGGTCTCC 2280 

GACA6CTC6C GCCTTTT6GA CCA6CTCAGG 6ACAGCCX3GA GAGAGGCAGA GAGGCTGGTG 2340 

GGGCAGGOSG QAG6A6GAG6 AGGCAGCX36C AGCXXX3UU3C TTGTG6CCCT GAGGCTGGAG 2400 

ATGTCTTOGT TGCCTGACCT GACACCCACC TTCAACAAGC TCTGTGGC3UV CTCC3«a3CA6 2460 

ATGGCTTQCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCOUtfSACAA T6GCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGGGCCGGTG GGGCCTTCTT GATGGCGGGG 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAAT GCCCAGCTCC AGOGGACCAG GCAGAT6ATT 2640 
AGGGCAGCOQ AOQAATCTQC CTCACAQATT CAATCCAGTQ 0CCAGCX3CTT GGAGACCCAG 2700

QTGAGOGCOV OCOQCTCCCA Gj^TGOAGOAA GATGTCAGAC QCACAGGGCT CCTAATCCAa 2760 

CAOOTCOOGG ACTTCCTAAC AGACCC03AC ACTGATGCAO OCACTATCCA GGAGGTCAGC 2820 

GAQGC06T6C TGGGCCTGTG 6CTGCCCACA GACTCAGCTA CTGTTCTGCA GAAGAT6AAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AAGGTGGACT TGGTGCtGTC CCAGACCAA6 2940 

CAGGACATTG CGCGT6CX06 CCX3GTTGCAG GCTGAGGCT6 AGGAAGCCAG GAGCC6AGCC 3000 

CATGCAGTGG AGG6CCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060 

CTGCAG6AAG CTCAGGACAC CATGOUWSGC ACCAGCCGCT CCCTTCGGCT TATCCAGGAC 3120 

AGGGTTGCTG AGGTTCA6CA GGTACTGCGG CCAGCAGAAA AGCTGGTQAC AAGCATGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG QAGGA6CTCC GGCACCAA6C CCXSGCAOCAO 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GCGGAAGGTG CCAGCX3AGCA GGCATTGA6T 3300 

GCCCAAGAGG GATTTQAGAG AATAAAACAA AAGTATGCTG AGTTQAAGGA CCGGTTGGGT 3360 

CAGAGTTCXIA TGCTGGGTGA GCAGGGTGCC OGGATCCAGA GTGTQAAGAC AGAGGCAGAG 3420 

GA6CT6TTTG GGGAQACCAT QGAGATGATO GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CTGGGGGQCA 6CCAGGCCAT CAT6CTGCGC TGGOOQGACC TGACAG6ACT G6AGAA606T 3540 

QTGGAGCAGA TCCGTGACCA CATCAAT6GG CG06T6CTCT ACTATGCCAC CTGCAAGIGA 3600 

T6CTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCOGCC TTTGCTTTTG GTTGGGGGCA 3660 

GATTGGGTTG GAATGCTTTC CATCTCXAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACXrCCT GGTGTGTAGC TAGTAAGATT ACCCT6A6CT GCAGCT6A6C CTGAGCCAAT 3780 

GOGACAGTTA CACTTGACAG ACAAAGATGO T66A0ATTGG CATGCCATTO AAACTAAGAG 3840 

CTCTCAAGTC AAGGAA6CTG G6CIGG6CAG TATOOOCOGC CTTTAGTTCT GCACIGGGGA 3900 

GGAA3CCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960 
AAAAICTTT6 6 

Seq ID NO: 521 Protein sequence 
Protein Accession #: NP_000219.1

1 11 21 31 41 51

| | | | | |

MRPFFXiLCFA LPGUiBAQQA CSRQACYPPV GDIiLV(3l1SF LRASSTOGLT KPEIYCTQYG 60 

EWQMKCCKCD SRQPBNYYSH RVENVASSS6 PMRWHQSQHD VNPVSLQLDL DRRFQLQBVM 120 

MEPQGPMPAG MLIERSSDPG KTWRVyQYLA ADCTSTFPRV RQGRPQSWQD VRCQSLPQRP 180 

NARLNGGKVQ LNLMDLVSGI PATQSQKIQE VGEITNLRVN PTRIiAPVPQR GYHPPSAYYA 240 

VSQLRLQGSC FCHGHADRCA PKPGASAGPS TAVQVEDVCV OQHNTAGPNC ERCAPFYKNR 300 

FWRPAEGQDA HBOQRCDCNG HSETCXFDPA VFAASQGAYG 6VCDNC3U3HT EGKHCEROQL 360 

HYFRNRRPGA SXQBTCI8CB CDPDGAVP6A PCDFVTGQCV CKBBVQGERC OLCKPGPTGL 420 

TYAIfPQGCHR CDOStJl/SStai DMPCDEESGR CLCLSNWGP KCDQCAPYHK KZiASGOGCEP 480 

CACDPHNSPQ PTVQPVHRAV PCRB6FGGLM CSAAAIRQCP DRTY(S>VATQ CRACDCDFRG 540

TEGPGCDKAS GRCLCRPGLT GPRCDQCQRG YCHRYPVCVA GHPCFQTYDA DLREQALRFG 600 

RLRNATASLW SGPOLEDROL ASRILDAKSX lEQZSAVLSS PAVTEQEVAQ VASAZLSLRR 660 

TI^QQLQLDLP LEEETLSLPR DLESLORSFN GLLTIfYQRRR EQFBRZSSAD PSGAFRMLST 720 

AYEQSAQAAQ QVSOSSRLLD QLRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDL 780 

TPTPNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG V1J>RAGGAPL MAGQVAEQLR 840 

GPNAOLQRTR QMIRAAEESA SQIQSSAQRIi ETQV8ASRSQ MSEDVRRTRL LIQQVRDFLT 900 

DPOTDAATIQ BVSEAVZiALH LPTDSATVLQ XMNBIQAZAA RI1PNVDI1VI18 QTKQOZARAR 960 

RLQAEABEAR 8RAHAVEGQV EDWGNLRQG TVALQEAQDT MQGTSRSIiRIi XQDRVAEVQQ 1020

VLRPAEKLVT SMTKQLGDFW TRMEBLREQA RQQGAEAVQA QQLABQASEQ ALSAQB8PER 1080 

IKQKYAELKD RLGQSSMLGE QGARIQSVRT BAEELFGBXM EMMDRMKDME LELIiRGSQAI 1140 
MLRSADLTGL EKRVEQIRDH IKGRVLYYAT CK 

Seq ID NO: 522 DNA sequence

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84..3083

1 11 21 31 41 51

| | | | | |

TTTTCTTAGA CATTAACTGC AGAC3GGCTGG CAGGATAGAA GCAGOGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCrCTGG 120

CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 180 

ATGATGAAGA AGAGATGACT AT6CAACAAG CTAAAAGAAG GCAAAAACX3T GAATGGGTGA 240 

AArrTGCCAA ACCXTOCAQA 6AAG6AGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 

TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTAGOG AATCTCTGGA GTGGGAATOG 360 

ATCAQCCTGCC TTTTOGAATC TTTGTTGTTG AGAAAAACAC TOGAGATATT AACATAACAG 420 

CTATAGTCGA CXX»36AGGAA ACTGCAAGCT TCCTGATCAC ATGTC6GGCT CTAAAT6C0C 480 

AAGQACTAGA TGTAQAGAAA CCACTTATAC TAAOSGTTAA AATTTTGGAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600 

ACTCACTGGT GATGATACTA AATGCCACAG AT6CAGAT6A ACCAAACCAC TTGAATTCTA 660 

AAATT60CTT CAAAATTGTC TCTCAOGAAC CAGCAGGCAC ACCCATGTTX: CTCXTTAAGCA 720 

GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTIGA COGAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAOSATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCA06TAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTG6CTT6 CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGOO TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACOCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAGGCAT AAGTAGCAAA AAArTGGTGG ATTATATCCT 6G6AACATAT CAAGCCATOG 1320 

AT6A6GACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATOG6ACX3T AACGATGGT6 1380 

6ATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACXX3AG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500 

CGGGTAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 1560

CAACA6CTGT CCTG6AAAAA GAT6CAGTTT GCAGTTCTTC AOCTTCOOTG GTT6TCTCC8 1620

CTAGAACACT QAATAATA6A TACACTGGCC CCTATACATT TGCACTG6AA GATCAACCTG 1680 

TAAAGTT6CC TGCCGTATGG AOTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCaGAG 1740 

CCCAGQAACA GATACXTTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATCX3GTG TGAGATGCCA GGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860 
GCATCTGTGG AACTTCTTAC CXaACCACAA GCCCTGGGRC CAGGTATGGC AGGC06CACT 1920
CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TG6TCTGCTG CTGCTGCTGT 1980
T6GCCCCCCT TCTGCTGTTG ACCTGTGACT GTGOGGCAGO TTCTACTGGG GGAGT6ACAG 2040
OTOOTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100
GAGCCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCrCCTGTA ACAGCCRATG 2160
GAGCCGATTT CATGGAAAGT TCTQAAGTTT GTAOVAATAC GTATGCCAGA GGCACAGC6G 2220
TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAJM3CTTGG AGCA6CCACT GAATCT6GAG 2280
GTGCTGCAGG CTTTGCAACA GGGACAGTOT CAGGAGCTGC TTCAGGATTC GGAGCAGCCA 2340
CTGGAGTTG6 CATCT6TTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400
GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460
TTTCTCAQAA AGCATTTGCC TGTGOSGAGG AAGACX»TGG CCAGGAAGCA AATGACTOCT 2520
TGrTGATCTA TOATAATGAA GGGGCAOATG CCACTGQTTC TCCT6TG6GC TCCGTGGGTT 2580
GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640
TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGAT6QTGA AGGCAAAGAA GTTCAGCOIC 2700
CCTCTAAAGA CA6CQGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760
CAGGATTTGT TAAGTGCCA6 ACTTTGTCAG GAAGTCAAGG AGCTTCTGCT TTGTCCGCCT 2820
CTGGGTCTQT CCAGCCAGCT GTTTCCATCC CTGACCCTCT 6CAGCATGGT AACTATTTA6 2880
TAACGGAGAC TTACTCG6CT TCTGGTTCCC TCSTGCAACC TTCCACTGCA GGCTTTGATC 2940
CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAOTGTTC 3000
CTGGCAACCT AGCTGGCCCA ACX5CAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGQ 3060
ATCCTTGCTC COGTCTAATA TGACCAOAAT GA6CTGGAAT ACCACACTGA CCAAATCTGG 3120
ATCTTTGOAC TAAAGTATTC AAAATAOCAT AGCAAAGCTC ACTGTATTG6 GCTAATAATT 3180
TOGCACTTAT TAGCTTCTCT CATAAACTGA TCACQATTAT AAATTAAATG TTTGGGTTCA 3240
TACOGCAAAA OCAATATOTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTOQC

Seq ID NO: 523 Protein sequence
Protein Accession #: NP_001935.1

1 11 21 31 41 51
| | | | | |
MMGLFPRTTG AIAIFVWXL VHGELRIETK GQYDEE^ITM QQAKRRQKRE WVKPAKPCRE 60
GEDNSKRNPI AKITSDYOAT QKITYRISGV GIDQPPFGIP WDKNTGDIN ITAIVDREET 120
PSFIiITCRAL NAQGLCfVBKP LILTVKILDI NDNPPVPSQQ IFMGSIBENS ASNSLVMILH 180
ATDADBPNHL NSKIAPKIVS QEPAGTPMFL LSRNTGEVRT LtNSLDREQA SSYRLWSOA 240
DKDGEOIiSTQ CBCNIKVKEV HDNFPMFROS QYSARIEENZ LSSEIiLRFQV TDIiDEEYTDN 300
HIAVYFFTSG NEGNHFEIQT DPRINB6ZLK WKALDYEQL QSVKLSIAVK NKAEFHQSVI 360
SRYRVQSTPV TIQVINVREG ZAFRPASKTF TVQKGISSKK LVDYIliGTYQ AIDKDTNKAA 420
SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NHDSTFIVNK TITAEVIAID EYT6KTSTGT 480
VYVRVPDFND NCPTAVLBKD AVCSSSPSW VSARTIJniRY TGPYTPALED QFVKLPAVHS 540
ITTUXATSAL LHAQEQIFPG VYHISLVLTD SQNNRCEMPG SLTLEVCQCD URGICGTSYP 600
TTSPGTRYQH. PHSGRLGPAA IGLLLLGLLL LLLAPLXiLIiT CDCGAGSTGG VTGGFIPVPD 660
GSBGTIHQNG lEGAEPEDKE ITNICVPPVT ANGADFMESS EVCTNTYARG TAVEGTSGME 720
MTTKLGAATE SGGAAGFAT6 TVS6AASGFG AATGVGIC3S GQSGTMRTRH STGGTNiOJYA 780
DOAZSMNFLD SYFSQKAFAC AEEDDGQEAN DCIibiyDNEO AZXATG8FVG8 VGCCSFIADD 840
LDDSFLDSL6 PKFKKLAEIS LGVDGEGKEV QPPSKD8GYG IB8CGHPZBV QQTOFVKCQT 900
LSGSQGASAL SASGSVQPAV SIPDPLQHOH YLVTETYGAS OSLVQPSTAG PDPMiTQHVr 960
VTBRVICPIS SVPGKLA6PT QUtGSHIMLC TBDPC8RLI

Seq ID NO: 524 DNA sequence

Nucleic Acid Accession #: XM_058069.2

Coding sequence: 1..1413

1 11 21 31 41 51
| | | | | |
ATGAAGTTTC TTCTAATACT 6CTCCTGCAG 6CCACTGCTT CTGGAGCTCT TCCCCTGAAC 60
AGCTCTACAA GCCTGGAAAA AAATAATGTG CTATTTGGTG AAAGATACTT AGAAAAATTT 120
TATGGCCTTG AGATAAACAA ACTTCCAGTG ACAAAAATGA AATATA6TGG AAACTTAATG 180
AA6GAAAAAA TCCAAGAAAT GCAGCACTTC TTGGGTCT6A AAGTGACCGG 6CAACTGGAC 240
ACATCTAOCC TGGAGAT6AT 6CAC3GCAOCT CGATGTGGM3 TCCCCQATGT CCATCATTTC 300
AGGGAAATGC CA6G6GGGCC CGTATGGAGG AAACATTATA TCACCTACAG AATCAATAAT 360
TACACACCTG ACATGAACCX3 T6AGGATGTT GACTACGCAA TCCGGAAAGC TTTCCAAGTA 420
TGGAGTAATQ TTACCCCCTT GAAATTCAGC AAGATTAACA CAGGCATGGC TGACATTTTG 480
GTGGTTTTTG CCCGTGGAGC TCATGGAGAC TTCCATGCTT TTGATG6CAA AGGT6GAATC 540
CTAQCCCATG CTTTTGGACC T6GATCTG6C ATTGGAGGGG ATGCACATTT CGATGAGGAC 600
GAATTCTGGA CTACACATTC AGGAGGCACA AACTTGTTCC TCACTGCTGT TCACGAGATT 660
GGCCATTCCT TAGGTCTTGG CCATTCTAGT GATCCAAAGG CC6TAATGTT CCCCACCTAC 720
AAATATGTTG ACATCAACAC ATTTGGCCTC TCTGCTGATG ACATACGTGG CATTCAQTCC 780
CTGTATGGAG ACCCAAAAGA GAACCAACX3C TTGCC3UUITC CTGACAATTC AGAACCAGCT 840
CTCTGTQAOC CCAATTTGA6 TTTTQATGCT GTCACTACCG TGGGAAATAA GATCTTTTTC 900
TTCAAA6ACA GQTTCTTCTG 6CTGAA6GTT TCTGAGAGAC CAAAGACCAG TGTTAATTTA 960
ATTTCTTCCT TATGOCCAAC CTTGCCATCT GGCATTGAAG CTGCTTATGA AATTGAA6CC 1020
AGAAATCAAG TTTTTCTTTT TAAAGATGAC AAATACTGGT TAATTA6CAA TTTAAGACCA 1080
GAGCCAAATT ATCCCAAGAG CATACATTCT TTTGGTTTTC CTAACTTTGT GAAAAAAATT 1140
GATGCAGCTG TTTTTAACCC ACGTTTTTAT AGGACCTACT TCTTTGTAGA TAACCAGTAT 1200
T08A6GTATG ATGAAAGGAG ACAGATGATG QACCCTGQTT ATCCCAAACT GATTACCAAG 1260
AACTTCCAAG GAATCX3GGCC TAAAATTGAT GCAGTCTTCT ACTCTAAAAA CAAATACTAC 1320
TATTTCTTCC AAGGATCTAA CCAATTTQAA 'SKSGhCSTCC TACTGCAACG TATCACCAAA 1380
ACACTGAAAA GCAATA6CTG GTTTGGTTGT TGA

Seq ID NO: 525 Protein sequence
Protein Accession #: P39900

1 11 21 31 41 51
| | | | | |
MKFLLILLLQ ATASGALPLN SSTSLEKNNV LPGERYLEKP YGLEINKLPV TKMKYSGNLM 60
KBKIQEMQHP LGLKVTGQLD TSTLEMMHAP RCGVPDVHHF RBMPGGPVWR KHYITYRINR 120
WFARGAHGD FHAFDGKGGI 180
GHSXiGLGHSS DPKAVMPPTY 240
LCDPNLSFDA VTTVGKKIFF 300
RNQVPLPKDD KYKLISNLRP 360
HRYDERKQMM DPGYPKLITK 420
TLKSNSWFGC







Seq ID NO: 526 DNA sequence

Nucleic Acid Accession #: NM_024423.1

Coding sequence: 64..2590

1 11 21 31 41 51

| | | | | |

G6CAGGTCTC GCTCTOGGCA CCCTCCCGGC GCCCGOGTTC TCCTGGCCCT GCCOSGCATC 60 

CC5C3ATGGCCG CCGCTGGGCC COGGOGCTCC GTGOGCGOAO CCCTCTGCCT GCATCTGCTQ 120 

CTGACCCTCO TGATCTTCAG TOGTGATGGT GAAGCCTGCA AAAA6GTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGQCAOA CAAAATAATT GGCAGAGTTA ATTT6GAAQA GTGCTTCAGG 240 

TCTQCAGACC TCATC0C3GTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TQGGTCAGTG 300 

TACACAGCCA GGGCTGTTQC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TA6AACATCA GAASAAGOTA 420 

TCQAAGACAA GACACACTAG AGAAACTGTT CTCAG6CGTG CCAAGAGGAG ATGGGCAOCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTrCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTT6AT 600 

AAAQAACCTT TAAATTTGTT TTATATAGAA AGA6ACACTG GAAATCTATT TTGCACTOGG 660 

CCTGTGGATC GTGAAGAATA TGATOTTTTT GATTTQATTG CTTATGOOTC AACTOCAGAT 720 

G6ATATTCAG CaGATCTGCC CCTCCGACTA CCCATCAGGO TASAGQATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGrGGTTTG TGCCACAGAC AGAOATGAAC CGGACACAAT GCATACGOGC 900 

CTGAAATACA GCATTTTOCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCXX: 960 

A6CACAGGCQ TAATCAOCAC AOTCTCTCAT TATTTGGACA GAOAGGrrGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT G6CCRGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAA6GATT TAATTAACAC T6CCAATTGG AGAGTCAATT TTACC3VTTTT A AAGGG AAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCAGAQAC AAAGAAACTA ATOAAGGTGT TCmCTOTT 1320 

GTAAA6CCAC TCAATTATGA AGAAAACOGT CAAGTGAACC TGQAAATT6G A6TAAACAAT 1380 

GAAOaSCCAT TT6CTAQAQA TATTCCCAGA GTGACAOCCT TGAACAQAGC CTTGGTTACA 1440 

GTTCATGT6A GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGG6TCAAAQ ATCAACX30CT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATG6TTT AAGGTACAAA AAATT6CATG ATCCTAAAGG TTGGATCACX: 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCJUUttTCC TGGATAGG6A 6GTTQAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGT6AA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTQCAA ACCAAAAATG GGGTATACCG ACATTTTA6C TGTTQATCCT 1860 

OATQAACCTG TCCATGGAGC TOCATTTTAT TTCSM3TTTGC CCAATACTTC TCCAGAAATC 1920 

AOTAQACTGT GGAGCCTCAC CAAAOTTAAT GATACAOCTO CCCGTCTTTC ATATCAGAAA 1980

AATGCTGGAT TTCAAGAATA TACXaVTTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGQA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCrTTTCTGT ATTGCTAACT TTAGTATGTQ GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACSSTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTQGAGAOG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTQGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCA66GG AGGACACAGG GAGGIGGACA ACTGCA6ATA CACTTACTGG 2520 

GAGTGQCACA GTTTTACTCA ACCCOGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCAT06 ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTC31CTT ATAACTATGA GGGAAGAGGA TCTCCAGCTQ aTTCXQTQGO 2700 

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTAC31 TTAGCAGAAG CATGCAC3UUV GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTC3UVTT TCAACATGTA 2880 

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG T6AAAAATGT TAAA ACAGAC AACTGGTA AA 3000 

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TrTTTTTAOG 3060 

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTArP CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AGTATCACTA TGTGAAGAAA GTTTTQGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTGCAGC TCATAAAOAA TTGGGACTCA CCCCTACTGC ACTA CCAAAT TCATTTQACT 3300 

TTGGAGGCAA AATGTGTTGA A G TGCCCTAT 6AAGTA6CAA TTTTCTATAG GAATATAGTT 3360 

QGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGQ GTATAGTTTG TCCTACAATA 3480 

GAAAAAA6AG AGAGCTTCCT AGGCCIGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAG6AAATAG TTCCTGTCC» ATTTGTOTAA TTTGTTTAAA ATTGTA AATA A ATTAAA CTT 3600 

TTCTGGTTTC TGTQGQAAGG AAATAGG6AA TCCAAT06AA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAOTTA GTAGCAAACT 6GGGAATACT CGCTGCAGCT 3720 

GOGGTTCCCT GCTTTTT6GT AQCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTQTTTCTA 3840 

TTCrCTCTTA TAOTOACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCXTTAO 3900 

AGTTTAGAGG CTAGAGQGAG CTGAGGGGA6 GATCTTACTG AAA6CACCCT GGGGAOATTG 3960 

ATTGTCCTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTAOUUVA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTQGCATT GGCCTQAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCXTCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAGGTTT TCCA<XATCC TTCAGCQTGA ATTAATTTTT AATC»GTTTG 4200 

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGAAATAOAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGC3W3GCCTT CAAGGGCAAG GAGAGGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT CGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 



CACTGCCTTT TCCTTTCTCA GG0CAATG6C AACTGCCATT T6AGTC0GGT 6AG6GATCA6 4500

CCAACCTCTT CTCTATGGCT CACX:TTATTT GGAGTGAGAA ATCAAGGAGA CAC3AGCTQAC 4560 

TGCATGATGA GTCTGAA6GC ATTTGCAGGA TOAGCCTGAA CTGGTTQTGC AOAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800 

TTGAGA06GA GTCTCGCTCT GACGCACAGG CT6GAGT6CA GTGGCTCCGA TCTCTGCTGA 4860 

CTGAAAGCTC CGCCTCCCGG GITCATGCKA TTCTCCTGCC TCAGCCTCCT GAGTAOCTOG 4920 

GACTAGAGGC GCCXyVCCACC ACX3CCCX30CT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAQGAT GGTCTCQATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCXrC GGCCTTGTTT TCCGTTTAAA 5100 

GTOSrCTTCT TTTAATQTAA TCATTTTGAA CATGTGTGAA AGTTOATCAT AGQAATTOGA 5160 

TCAATCTT6A AATACTCAAC CAAAAOACAG TOSAGAAQCC AGGGOOAOAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCT6AG AATGGAATTC TCTGTAAGCC TAGTTOCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACX3GC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGT6CCG ATAAACTTTC 5400 

TCAAA6A6CA ACGAGTCATCA CTTCCCTOTT TATAAAACCT CTAACCATCX CTTTGTTCTT 5460 

TGRACATGCT GAAAACCACC TGGTCT6CAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GQATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TQTAAGGTQA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT AC3VGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTQCAATGT 5760 

TTTAAACAGA 6TTTTA6TAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GQGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAA6TAATA 5940

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGQ QGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTOSACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTKTAATAAA TTTTGATOGG 6240

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 

ACAGQGGTTT TACTTTGAGG ACCAGTQTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTQ ATTCTGCCSU: TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATQA TCCAACXATA AAG0T6CTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT OCCCTACAAA COTTAAOACT GATCATTTCA AAAATCTATT AGCTATATCA 6540

AAAGCCTTAC ATTTTAATAT AGGTTOAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATQT CTTCAAGAAT GTTCATTGGA TTTTTGrTTG TAATAGTAAA 6660 

ATACOGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GrrOAGAAGC ATG6ACACTA GAGCCAGAAT 6CTTGGATAT GAATCCTGGA TCTGTCACTT 6780

ACTTCTGTOT GAOCTTTQAA AGGCTACTTA TTTCCTCTCT TaGCTTTCTC ATTAAAATCA 6840

ATGAACAAT6 CCAGCCTCAT GGGGTT6TTG AATGATTAAA TTAGTTAATA TACCTAAAST 6900 

ACATAGAACA CTGCCTGCAC ATA6TAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTA6G6AAAT AAAGTTTGI6 7020 
CaTATATATA ATCCCGAAAC ATG

Seq ID NO: 527 Protein sequence
Protein Accession #: NP_077741.1

1 11 21 31 41 51
| | | | | |
MAAA6PRRSV RGAVCLHLLL TLVIFSRDGB ACKKVILNVP SKLEADKIIG RVNLEECFRS 60
ADLIRSSDPD FRVLNDG8VY TARAVALSDK KRSFTIWLSD KRKQTQKBVT VUtEHQKKVS 120
KTRETRBTVL RRAKRRWAPI PCSMQENSIiG PFPLFLQQVE SDAAQNYTVP YSISGRGVDK 180
BPIMiFYIEai DTGNIiPCTRP VDREEYDVFD IiIAYASTADG Y8ADLPLPLP IRVEDENDNK 240
FVFTBAIYNF EVLBSSRFOT TV3WCATSR DBPOTMHTRL KITSILQQTPR SPGLFSVHPS 300
TOVITTVSHY LDRBWDKYS LZMKVQDMDG QFFGLXGT8T CIITVTDSND NAPTFRQNAY 360
EAFVEENAFN VBILRIPIED KDLINTAMWR VNETILKGHE NGHFKISTDK ETNEGVLSW 420
KPLNYEENRQ VNLEIGVNNE APFASDIPRV TALNRALVTV HVRDLDE6PE CTPAAQYVRI 480
KENLAVGSKI HGYKAYDPEN RNGNGLRYKK LHDPK6WITI DEISGSIITS KILDREVETP 540
XHELYNITVL AIOKDDRSCT GTLAVNIEDV NDNPPEILQB YWICKPKMG YTDILAVDPD 600
BPVHOAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AQFQEYTIPI TVKDRAGQAA 660
TKLItRVNLCE CTHPTQCRAT SRSTGVItiGR WAILAXIiLGI ALLFSVLLTL VGGVFGATX6 720
KRFPEDLAQQ NIiIISNTSAP GDDRVCSANG FMTQTTNNSS QGFCGTM0S6 MKNGGQBTIB 780
HMKGGNQTLE SCRGAtaiHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG BESIRGHTG

Seq ID NO: 528 DNA sequence

Nucleic Acid Accession #: NM_001941.2

Coding sequence: 64..2754

1 11 21 31 41 51
| | | | | |
GGCAG6TCTC GCTCTC6GCA CCCTCCCGGC GCCCGCGTTC TCCTGGCGCT GCCGGGCATC 60
C08ATGGCX» CCGCTOQGGC CCGG06CTCC GTGCGOGGAO COQTCTGCCI OCATCIGCTG 120
CTGACCCTCQ T6ATCTTCA6 TCGTGATGGT GAA6CCT6CA AAAAQGT6AT ACTTAATGTA 180
CCTTCTAAAC TA6AGGCAGA CAAAATAATT GGCAGAQTTA ATTTGGAAGA GTGCTTCaGG 240
TCTGCAGACC TCATCCXjGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300
TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360
GACAAAAGQA AACASACACA GAAAGAOGTr AGTOTGCTGC TAGAACATCA GAAGAAGGTA 420
TOSAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAG6AG ATGGGCACCT 480
ATTCCTT6CT CTATGCAAQA GAATTCCTTQ GGCCCTTTCC CATT6TTTCT TCAACAAGTT 540
GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACX3 TQQAGTTGAT 600
AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660
CCTGTG6ATC GTGAAGAATA TOATGTTTTT GATTTQATTG CTTATGGQTC AACTGCAGAT 720
GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGOO TAOAGGATGA AAATGACAAC 780
CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840
ACTACAGTGG GGGTGQTTTG TGCCACAGAC AGAGATGAAC C3GGACACAAT GCATAOSCGC 900
CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960
A6CACA66C6 TAATCAOCAC AGTCTCTCAT TATTTGGACA GAGA6GTTGT A<»AGAAGTAC 1020

TCATTGATAA TOAAAGTACA AGACATGOAT GGCCM3TTTT TTGGATTOAT AC3GCACATCA 1080 

ACTTGTATCA TAACAGTAAC ASATTCAAAT OATAATQCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAOGA AAATGCATTC AATOTGGAAA TCTTACGAAT ACCTATAGAA 1200 

OATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT A AAGG GAAAT 1260 

GAAAATGOAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

CnAAASCCAC TGAATTATOA AGAAAACOZT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

QAA6CGCCAT TT6CTA6AGA TATTCOCAGA GTGACAQCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCX3G 1500

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCOQAA 1560 

AATAGAAAT6 GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGQ TTQQATCACC 1620 

ATTOATGAAA TTTCAQOQTC AATCATAACT TCCAAAATCC TGGATAGGQA GGTTGAAACT 1680 

CCCftAAAATO AOTTGTATAA TATTACAOTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCA6A AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGQTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTG6AT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CG6CCAA6CT 2040 

GCAACAAAAT TATTGAQAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGT6C6 2100 

ACTTCAAGQA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGQT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG T6CAACTAAA 2220 

OGGAAAOBTT TTCCTCAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAQAAGCA 2280 

CCTGGA6A06 ATAGAGTGTG CTCTGCCAAT GOATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATOAAAA ATGGAGQGCA GGAAACCATT 2400 

GAAATGATGA AAQGAGGAAA CCAGACCTTO QAATCCTGCC GGGGQGCTGG GCATCATCAT 2460 

ACXCTGGACT CCTGCS^SGGG AGGACACAC» GAGGTG6ACA ACTGCAGATA CACTTACTOG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATCGATG TAATCAQAAT 2580

QAAQACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAQAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAOTGAA AA6CAGGAA6 AA6ATGGCCT IGACTTTTTA 2700 

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAQC*AT GCACAAAGAG ATOATGTCAC 2760 

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGGAGGTTT CXSUUUiATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATGAT TTTTTTCTCA ATTTTQAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTACGGAT ATTTTAGTAA TAAATATGCT GGAXAAATAT TAGTCCAACA 3060 

ATA6CTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAA6 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

AGCAAATTCA TTTQACTTTG GAGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGC3VATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAAT6T0 TOTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAAT6A6AAC AAA6AGGAAA AT66TAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCCTAGO CCTGGQCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATA6TTC CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGQAAA TAGQGAATCC AATQGAACAG 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCXA CAAGTTAGTA GCAAACTGG8 3660 

GAATACTCGC TGCAGCTGGG GTTCCCTOCT TTTTG6TAGC AAGG6TCCA0 AGATGAG6TQ 3720 

TTTTTTTCGG GQAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCTATTGCT GTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTTAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGAGQCTA GAGGGAGCTO AGGQGAGGAT CTTACTGAAA 3900 

aCAC3CCTGGG GAGATTGATT GTCCTTAAAC CTAAOCCCCA CAAACTTGAC ACXTPGATCAG 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TOGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAGA AATTriAAAA TAATAGAAGA AATAGAAATT 4200 

TTQAATOTAT AAAAGAAAAA QATCAA6TT0 TCATTTTAQA ACAGAGGGAA CTTTGGGAfiA 4260 

AAGCAGCCCA AGTAGGTTAT TTGTACAGTC AGAGG6CAAC AGGAAGATGC AGGCCTTCAA 4320 

GGGCAAGGAG AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCT TCATA 4380 

CTTTTTCCTA GGCTTG6CAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCCGGTGAG QQATCAGCCA ACCTCTTCTC TATGGCTCAC CTTATTrGGA QTGAGAAATC 4500 

AAGGAGACA6 AOCTGACTGC AIGATGASTC TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

QTTGT6CAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT T ATTCC TACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCCCC TTTTTTTTTG AGACGGAGTC TOGCTCTGAC GCACAGGCTG GAGTGCAGT6 4800 

GCTCCGATCT CTGCTCACTG AAAGCTCCGC CTCCCGGGTT CATGCCATTC TCCTGCCTCA 4860 

GCCTCCTGAQ TAQCTQGGAC TACAGGCGCC CACCACCACG CCCX3GCTAAT TTTTTGTATT 4920 

TTTAATAGAG AOGGGGTTTC ACTGTGTTAG CCAGGATGGT CTCGATCTCC TeACCTOGTG 4980 

ATCCGCCTGC CTCG6CCTCC CAAAGTGCTG GGATTACAGG CATQACCCAC O6CTCCO0GC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATCA TTTTGAACAT 6TGTGAAAQT 5100 

TQATCATACX3 AATTGGATCA ATCTTQAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAQQ 5160 

GG6AGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTGCTGAAAT TTCCTGCTGT AACCAQAAGC CAGTTTTATC TAAOGGCTAC TGAAA CACC C 5280 

ACTGTGTTTT GCTCACTCCC TCACTCACGG ATCAAAACCT GCTACCTOCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATGTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCOGAA 5460 

TTTOTAATTC TTTTCTCTCA AATOAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CAAOAAAATA TATTTTTAAA GCTTTCATTT TTCGCXXAOT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATQTTT TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCA6T 5700

6GGGTTTGTT TTGCAATGTT TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGQATGCATA AAGTAATATT 5820 

TACAGATGTG GGGAGAT6TA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG C113GCATGGA 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGGGO AATATTTTG6 ACAATGTTTC 6000 

ATTATCAAAT TGTCGACATC ATTAATATAT ATTGTAATGT TQGGAAGA6A TCACTATTTT 6060 

GAAGCACAGC TTTACAGATG AGTATCTATG ATACATATGT ATAATAAATT TTGATCGGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATA6T ACACTGACAC 6180 
AGGGGTTTTA CTTTGAGGAC CAGT6TA6TC AAGGGAAAAC ATQAGTTAAA AA6AAAAGCA 6240
GGCAATATTG CAGTCTTGAT TCTGCCACTT ACS^GGATAGA TAATGCCTGA ACTTTAATGA 6300
CAAGATGATC CAACXATAAA GGTGCTCTGT GCTTCACAGT QAATCTTTTC CXTCATGCAGG 6360
AGTGT6CTCC CCTACAAAC36 TTAAGACT6A TCATTTCAAA AATCTATTAG CTATATCAAA 6420
AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480
CATTATTTTT GTGTATGTCT TCAA6AAT6T TCATTGGATT TTTGTTT6TA ATAGTAAAAT 6540
ACCGGATACA TTTCACQTGT CCTTCAGTAT TGATTTGGTT GAATATTGG6 TCATAATGGT 6600
TQAGAAGCAT GGACACTAGA GCCAQAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660
TTCT6TGTQA CCTTT6AAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720
OAACAATGCX: A6CCTCATG6 GGTTGTTGAA TGATTAAATT AGTTAATATA OCTAAAGTAC 6780
ATAOAAGACT GCCTGCACAT AGTAAAAGAA 7TATAA0TGT QAGGTAOTTG OTAAAATTAT 6840
GTASTTOGAT ATACTACC6A ACAATATCTA ATCTCTTTTT AGG6AAATAA AGTTTGTGCA 6900
TATATATAAT CCOGAAACAT G

Seq ID NO: 529 Protein sequence
Protein Accession #: NP_001932.1

1 11 21 31 41 51
| | | | | |
MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNI^CFRS 60
ADLIRSSDFD FRVLNDGSVY TAXtAVALSDK KSSFTIWLSD KRKQTQKEVT VLIiEHQKKVS 120
KTRRTRETVL RRAKRRWAPI PCSMQENSLG PPPLFLQQVB SDAAQNYTVF YSISGRGVDK 180
BPIiHLFYIER DTGNLFCTRP VDREBYDVFD LIAYASTADG YSADLPLPLP ZRVEDHNDMH 240
PVFTEAIYNP BVLBSSRPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300
TGVITTVSBY LDRSWDKYS LXHKVQDMD6 QFFGLZGTST CIITVTOSND KAPTFKQNAY 360
EAFVEEHAFN VEIIAIPIBD KDLINTANWR VMFTIIiKGNE NGBFKISTDK ETNEGVLSW 420
KPLNVESNRQ VNLEIOVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRZ 480
KENIiAVGSKI NGYKAYDPEN RNQJGLRYKK LHOPKGHITI DEISGSIITS KILDREVETP 540
KNELYHITVli AIDKDDRSCT GTLAVNXEDV NDNPPBIIiQE YWICKPKMG YTDllAVDPD 600
SFVaGAPFYF SLPNTSPBIS RLWSIiTKVND TAARLSYQKN AGFQSYTIPI TVKDRAGQAA 660
TKLURVNLCE CTHPTQCRAT SRSTGVIIiGK WAZLAIIiLGI ALLFSVIjLTL VCGVFGATKG 720
KRFPGDLAQQ NLIISNTEAP GDDRVCSAKG FMTQTIT7NSS QGFCX3TM6SG MKN6GQETIE 780
MMKGGNQTIiE SCRGAGHKHT IiDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCHQKE 840
DRMPSQDYVL TYNYEGRGSP AGSVGCCSSK QEBDGLDFIJI NLEPKFITLA BACTKR

Seq ID NO: 530 DNA sequence

Nucleic Acid Accession #: NM_016583.2

Coding sequence: 72..842

1 11 21 31 41 51
| | | | | |
GGAGTG6GGG AGA6AGAGGA GAOCftOGACA GCTGCTGAGA CCrCTAAGAA GTCGAGATAC 60
TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120
CCATGGCCXA GTTTGGAGGC CTGCCOGTGC CCCTGGACXyV GACCCTGCCC TTGAAT6TGA 180
ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT QCCCTCAGCA 240
ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA rrCTGGAAAA CCTTGCGCTC CTGGACATCC 300
TOAAGOCTSG AGGA6GTACT TCTGGTGOCC TOCTTOGGGG ACTGCTTGGA AAAGT6ACGT 360
CAGT6ATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCT6G 420
AACTTGGCCT T6TGCA6A6C CXnXSATGGCfC ACC3GTCTCTA TGTCACCATC CCTCTCGGCA 480
TAAAGCTCCA A6TGAATACG CCCCTOQTCQ QTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540
TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAG6ATC CACCTGGTCC 600
TT6GTGACT6 CAOCCATTCC CCTGGAAOCC TGCAAATTTC TCTQCTTGAT GGACTTQGCC 660
CCCTCCrCAT TCAAGGTCTT CTGQACAOC!C TCACAGK36AT CTTGAATAAA 0TCCT6CCT6 720
AGTTGGTTCA GGGCAACX3TG TGCCCTCTGG TCRATGAGGT TCTCAGAGGC TTGGACATCA 780
CCCTGGTGCA TGACATT6TT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840
AA6CCTTCCA GGAAGG66CT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CA6ATG6CTG 900
GGCCATGTGC TGOAAGATOA CACAGTTOCC TTCTCTCCGA 6GAAGCTGCC COCTCTCCTT 960
TCCCAOC!AG6 OGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGOCT CTTCTTCTGC 1020
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA

Seq ID NO: 531 Protein sequence
Protein Accession #: NP_057667.1

1 11 21 31 41 51
| | | | | |
MFQTGGLZVF YGLLAQTMAQ FGGIiFVPLDQ TLPLNVNPAIi PLSPTGLAGS LTNALSKGLL 60
SGGLUSlhW IfliLDXLRPO GGT8GGLLG6 LLGKVTSVIP GLNNIIDIKV TDPQLIiEIiGIi 120
VQSPDOHRLY VTXPLQIXLQ VNTPLVGASIi LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180
THSPGSLQZS IiLDGLQPLPZ QGIiLDSLTGI UnCVLPELVQ GNVCPLVMEV LRGLDITLVU 240
DlVNHI.IiKa> QFVIKy

PCT/US02/12476

Seq ID NO: 532 DNA sequence

Nucleic Acid Accession #: NM_004363.1

Coding sequence: 115..2223

1 11 21 31 41 51
| | | | | |
CTCACSGGCAG A6GGAG6AAG GACA6CAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60
TCCTGGAACT CAAGCTCTTC TCCACAGAGG A06ACAGAGC AGACAQCAGA GACCATGGAG 120
TCTCCCTCGG CCCCXCCCCA CAGAT6GTGC ATCCCCTG6C AGAGGCTCCT GCTCACAGCC 180
TCACTTCTAA CCTTCTGGAA CCGGCCCACC ACTGCCAAGC TCACTATTGA ATCCAOGCCG 240
TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300
TTTGGCTACA 6CTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360
GTAATAGGAA CTCAACAAGC TACCCCA66G CCOGCATAGA GTGGTCGAOA (gCTAATATAC 420
CCCAATGCAT CCXT6CTGAT GCAGAACATC ATCCAGAATG ACACA6GATT CTACACCCTA 480

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540

GAGCTGCOCA A6CCCTCCAT CTCO^GCAAC AAC7CCAAAC COGTGGAGOA CAAGGATQCT 600 

G7G6CCTTCA CCTGTGAACC TQAGACTCAG GACGCAACCT ACXrrOTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTQCAG CTOTCCAATG GCAACAQOAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAQCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

6CCAGGCX5CA GTGATTCAGT CATCCTQAAT GTCCTCTATO GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900 

TCTAACCCftC CTGCACAGTA CTCTTGGTTT GTCAATGOGA CTTTCCAGCA ATCCACXX^^ 960 

GAGCTCTTTA TCCCCMCAT CACTGTGAAT AATAGTGGAT CCTATACXJTG CCAAGCCCAT 1020 

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGAOGA TCACAGTCTA TGCAOAGCCA 1080

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTQTO AA0CTGA6AT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTOCCGGTCA (3TC0CAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAQT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGOGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACOACCOCAC CATTTCCCCC 1380 

TCATACACCT ATTACOGTCC AG6GGTGAAC CTCAGCCTCT CCTOCCATGC AGCCTCTAAC 1440 

CCACCTGCAC AGTATTCTTG GCTCATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAOTCAAG ACAATCACAQ TCTCTGOGGA GCTGCCCAAG 1620 

OCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGA6GACA A6GATQCTGT GGCCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CT6TG6T6GG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA GGCTQCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATOTCACA 1800 

AOAAATQACG CAAGAGCCTA TGTATGTGGA ATCXaGAACT CAGrTGAGTGC AAACCGCAGT 1860 

GACCCAGTCA OCCTGGATGT CCTCTATGGG COGGACACCC CCATCATTTC CCCXXXaGAC 1920 

TOGTCTTACC TTTOGGGAGC GAACX7CAAC CTCTCCTGCC ACTCGGCXTC TAA COCAT OC 1980 

COGCAGTATT CTTGGCGTAT CAAT666ATA CCGCAGCAAC ACACACAA6T TCTCTTTATC 2040 

GCCAAAATCA CGCCAAATAA TAAOGGGACC TATGCCTQTT TTGTCTCTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCtCCTGGT 2160 

CTCTCAGCTG GGGCCACTGT CGQCATCATG ATTGGAGTOC TGGTT GGGGT T GCTCTGAT A 2220 

TAGCAfiCCX? GOTGTAGTTT CTTCATTTCA GGAAGACIGA OUITTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAC3TCTAAAA TTSCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATGG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGOG CTTGGTGGCQ CGCACCrGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCQCTTG AACCCGGGAG GTGGA6ATTG CAGTGAGCCC AGATCGCACC 2520 

ACTGCACrCC aGTCIOGCAA CAOAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTOACCTOT ACTCTTGAAT ACAACTTTCT GATACCACTG CACTGTCTGA GAATTTOCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAG ATCA A GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGA(3GATTGC TGATTCTTTA AATQTCTTOT 2760 

TTCCCAGATT TCAGGAAACT yrrm ' Ci ' TT TAAGCTATCC ACTCTTACAO CAATTTGATA 2820 

AAA3ATACTT TTGT6AACAA AAATTQAGAC ATTTACATTT TCTCCCTATO T60T06CTCC 2880 

AGACTTGG6A AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAOT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAQA AAAA 

Seq ID NO: 533 Protein sequence
Protein Accession #: NP_004354.1

1 11 21 31 41 51

| | | | | |

MESPSAPPHR WCIPHQRLLL TASLLTFVJNP PTTAKIiTIES TPPNVAEGKB VLLLVHNIiPQ 60 

HLFGYSWYXG ERVDCaiSQII GYVIGTQQAT PGPAYSGREI lYPNASLLIQ HIIQHDTGFY 120 

TUBVIKSDLV HBEATOQFRV YPBLPKPSZS SNNSKPVEDK DAVAFTCEPE TQBATYLNWV 180 

NKQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCBTQNP VSARRSDSVI LNVLYGPDAP 240 

TISPLNTSYR SGBNLNLSCH AASNPPAQYS WFVNGTFQQS TQELPIPNIT VNHS6SYTCQ 300 

AHNSDTGLNR TTVTTITVYA EPPKPPITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360 

QSIiPVSPRIiQ LSNDHRTLTL IjSVTRNDVGP YEOGIQNELS VDHSDPVILN VLYGPDDPTl 420 

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGMIQQHTQ ELPISNITEK NSGLYTOQAN 480 

MSASGHSRTT VKTITVSAEL PKPSISSNMS KPVEDKDAVA FTCEPEAQNT TYLWWVK OQS 540 

LPVSPRLQLS NGNRTLTLFN VTHNDARAYV OGIQNSVSAN RSDFVTliXyVIi YOPDTPIISP 600 

PDSSYLSGAN LMLSCHSASN PSPQYSHRIK GIPQQBTQVL FIAKITPmm GTYACFNTSNL 660 
AT6RNNSIVK SITVSAS6TS PGIiSAGATVG IMIGVLVGVA LI 

Seq ID NO: 534 DNA sequence
Nucleic Acid Accession #: NM_006952.1
Coding sequence: 11..793

1 11 21 31 41 51

| | | | | |

AATCCCGACA ATGGOOAAAO ACAACTCAAC TGTTOGTTGC TTCCA66GCC TGCTGATTTT 60 

TGGAAATGTO ATTATTQGTT QTTGC36GCAT TGCOCTGACT GCGGAGT6CA TCrTCTTTGT 120 

ATCTGACCAA CACAGCCTCT AOXACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 180 

GQCTGCCTGQ ATCX3GCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240 

TGTA6GCATC ATGAAGTCCA 6CAG6AAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300 

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAAGGAQ ACXTTTTCAC 360 

ACCCAACCTC TTCCTGAAGC A6ATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTQG GACAGGCTCA TGCTCCAGGA 480 

CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540 

TGAGAATAAT GATGCT6ACT ATOCCTGOOC TGQTCAATGC TQTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGGGTGCCT GGTTTTTATC ACAATCAG6 G 660 

CTGCTATGAA CTGATCTCTG GTCCAATQAA CCGACAOGCC TGGGGGGTTG CCTGGTTTGQ 720 

ATTTGCCATT CTCTGCTGGA CTTTTTG6GT TCTCCTGGGT ACCATGTTCT ACTQQAOCAG 780 
AATTGAATAT TAAGAA 

Seq ID NO: 535 Protein sequence
Protein Accession #: NP_008883.1

1 11 21 31 41 51

| | | | | |

MAKDNSTVRC FQGIiLIFCTIV IIGCOGIALT AECIFPVSDQ HSLYPIiLEAT DNDDIYGAAW 60
IGIPVGICLP aiSVWlVQI MKSSRiaLlA YPILMPIVyA PBVASCITAA TQRDPPTMH. 120
FLKQMLERYQ NNSPFNNDDQ VKNNGVTKTN DRLMLQDKCC GVMGPSDWQK YTSAFRTENN 180
DADYPHPSQC CVHIOIZiKBPL ISLEI^CKLSVP GPyBMQGCYB LISGFMNRBA W3VAWP6FAI 240
LCHTFNVLL6 TMFYWSRIEY

Seq ID NO: 536 DNA sequence
Nucleic Acid Accession #: NM_002638.1
Coding sequence: 120..473

1 11 21 31 41 51

| | | | | |

CAATACAGCT AAG6AATTAT CCCTTGTAAA TACGAGAGAC C90GCCCTGGA GCCASGCCAA 60
6CTGGACTGC ATAAAGATT6 GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120
TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CX3CTGGGAC6 CTGGTTCTAG 180
AGGCAGCTGT CACGGOAGTT CCTGTTAAAG GTCAAGACAC TGTOVAAGGC CGTGTTCCAT 240
TCAATGGACA A0ATCC08TT AAAGGACAAS TTTCAOTTAA AGGTCAAGAT AAAGTCAAAG 300
GGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGOCCC ATTATCTTGA 360
TCCGGT6CGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AQATACTGAC TGCCCAGGAA 420
TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT OGTTCCCCAG TGAAGGGAGC 480
CX3GTCCTTGC TGCACCTGTQ CCX3TCCCCAG AGCTACAGGC CCCATCTG6T CCTAAGTCCC 540
TQCTGGCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGAT6 CCCACX3GCTG 600
GAOCTGCCTC TCTCATCCAC TTTCCAATAA A

Seq ID NO: 537 Protein sequence
Protein Accession #: NP_002629.1

1 11 21 31 41 51

| | | | | |

MRASSPIiIW VFLIAGTIiVL EAAVTGVPVK GQDTVKGRVP F39GODPVKGQ VSVXGODKVX 60
AQEPVKGPVS TKPGSCPIIIj IRCAMLNPPK ROjKDTDCPG ZKKCCEGSCQ MACFVPQ

Seq ID NO: 538 DNA sequence
Nucleic Acid Accession #: NM_001793.2
Coding sequence: 71..2560

1 11 21 31 41 51

| | | | | |

AAAGGGGCAA GAGCTGAG06 GAACACCGGC COGGCGTCGC GGCA6CTGCT TCACCCCTCT 60
CTCTGCAGCC ATGGGGCTCC CTOgTOGftOC TCTGGGQTCT CTCCTCCTTC TCCAGGTTT6 120
CTGGCTGCAG TGCGC6GCCT COGAGCCGTG CGGGGGGGTC TTCAGGGAGG CTGAAGTGAC 180
CTTGGAGGCG GGAGGCGCGG AGCAGGAGCC OSGCCAGGGG CTGGGGAAAG TATTCATGGG 240
CTGCCXn'GGG CAA6AGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 300
TGGCQA6ACA GTOCAGQAAA OAAGGTCACT GAAGGAftAGG AATCCATTGA AGATCTTCCC 360
ATGCAAAG8T ATCTTACGAA GACACAAOAG AGATTOGGTO GTTGCTCCAA TATCTGTCCC 420
TGAAAATGGC AAGGGTCCCT TCCCCCAGA8 ACIGAATCAG CTCAAGTCTA ATAAAOATAG 480
AGftCACCAAG ATTTTCTACA GCATCACGGG GC06GGGGCA GACAGCCCCC CTGAGGGTGT 540
CTTOGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600
GATTGCCAAG XATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAGGA 660
CCCCATGAAC ATCTCCATCA TCGTGACOGA CCAGAATGAC CACAA6CCCA AGTTTACCCA 720
66ACACCTTC OGAGGGAGTG TCTTA6AGGG AGTCCTACCA G6TACTTCT0 TGATGCAGGT 780
GACAGCCACQ GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840
CCATAGCCAA GAACCAAAGG ACCCACAOGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900
CACCATCAGC GTCATCTCCA GTGGCCTG6A CGGGGAAAAA GTCCCTGAGT ACACACTGAC 960
CATCCAQGCC ACAGACATQ6 ATGQGGAGGO CTCCACCACC AOGGCAGIGG CAGTAGTGGA 1020
GATCCTTGAT GCCAATQACA ATGCTOCCAT 6TTTGACCCC CAGAAGTA06 AGGCCCATGT 1080
GCCTGAGAAT GCA6TGGGCC ATGAGGTGCA GAGGCTGAGG GTCACTGATC TGGAOSCCCC 1140
CAACTCACCA GCOTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGGACCATTT 1200
TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260
TnTGAGGCC AAAAACGAGC ACACCCTGTA CGTTGAAGTG ACCAACGAGG CCCCTTTTGT 1320
QCTQAAGCTC CCAACCTCCA CAQCCACCAT A6TGGTCCAC GTGGAGGATG TGAATGAGGC 1380
ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440
GCCT6TGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCA6CTACCG 1500
CATCCTGAGA GACCCAGCAG 0GTGGCTA6C CATGGACCCA GACAGTGGGC AGGTCACAGC 1560
TGIGGGCACC CTCSACCGTG AGGATGAGCA 6TTTGTGAGG AACAACATCT ATGAAGTCAT 1620
G6TCTT6GCC ATG6ACAATG GAAGCCCTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680
ACTGATTGAT 6TCAATGACC ATGGCCCAGT CCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740
CCAAAGCCXn' GTGCGCCAGQ TGCTGAACAT CACG6ACAA6 GACCTGTCTC CCCACACCTC 1800
CCCTTTCCAQ GCCCA6CTCA CAGATGACTC AGACATCTAC TG6ACGGCAG AGGTCAACGA 1860
G6AAGGTGAC ACAGTGGTCT TQTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920
GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG AC6GTGATCA GGGCCACTGT 1980
GTGOGACTGC CATGGCCATG TOGAAACCTG CCCTGQACCC TGGAAGGGAG GTTTCATCCT 2040
CCCTGTGCTG GGGGCTGTCC TGGCTCIGCr GTTCCrCCTG CT(S3TGCIGC TTTTGTTGGT 2100
GAGAAAGAAG C6GAAGATCA AOGAGCCCCT CCTACrCCCA GAA6AT6ACA CCCGT6ACAA 2160
CGTCTTCTAC TATGGCGAAG AGGGGGGTGG OQAAGAGGAC CAGGACTATG ACATCACCCA 2220
GCTCCACOGA GGTCTGGAGG CCAGGCOGGA GGTGGTTCTC CGCAATGAOG T6GCACCAAC 2280
CATCATCCCG ACACCCATGT AC06TCCTC6 GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340
TATAATTSAO AACCTGAASG C06CTAACAC A6ACCCCACA GCCCCGCCCT ACGACACCCT 2400
CrPGGTGTTC GACTATGAGG GCA6GGGCTC 0GACGCC6C6 TCCCTGAGCT CCCTCAOCTC 2460
CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GAGTGGGGCA GCCGCTTCAA 2520
GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580
GGGACCAAAC GTCAGGCCAC AOAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640
QACTTCGaAG CnGTCAQGA AGT6GCGGTA QCAACTT66C GGAGACAGQC TATGAOTCTG 2700
ACGTTAGA6T GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760
AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAA6CC 2820
TCTTACCTGC OGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGCCT6CT GTGACTGACC 2880
TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAQGCC TCCTGGTGCA ACTTAATTTT 2940
TTTTTTTAAT GCTATCTTCA AAA08TTAGA QAAAGTTCTT CAAAAGTGCA 6CCCA6AGCT 3000
GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060
TGGATCTCTG CGTTTTTATA CTQAGTGTOC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120
GrTGGGTTGC TATAOATGAA GGGTGAOGAC AATG6T6TAT ATGTACTAGA ACTTTTTTAT 3180
TAAAOAAACT TTTCCCAGAA AAAAA

Seq ID NO: 539 Protein sequence
Protein Accession #: NP_001784.2



1 

I 

KGLPRGPbAS 
QEPALFSTDN 
JCGPFPQRLNQ 
YELFGKAVSE 
CEDDAIYTYK 
TDMDGDGSTT 
AHRATYIilMG 
PTSTATIWH 
DPAONLAHDP 



TWIjSLKKPIj 
QAVLALLFLL 
GDBARPEWL 



11 

I 

LLLLC2VCWLQ 
DDFTVRNGET 
IiKSmCDRDTK 
HQASVBDFMN 
GWAY8IHSQ 
TAVAWEILD 
GDDGDHFTIT 
VEDVHBAPVF 
OSGQVTAVGT 
RQITICaVQSP 
KQDTVDVHLS 
LVLLLLVRKK 
R2IDVAPTIIP 
SLSSLTSSAS 



21 

1 

CAASEPCRAV 
VQERRSIjKER 
IPYSITGPGA 
ISIZVTOQND 
BPKDPHDLNP 
ANDNAPMFDP 
THPE5NQGIL 
VPPSKWEVQ 
UmEDEQPVR 
VRQVLNITDX 
LSDHGNKEQL 
RKIKEPLLLP 
TPMYRPRPAK 
DODQDYDYLN 



31 

I 

FRBAEVTLEA 
KPLKIFPSKR 
DSPPEGVFAV 
HKPKFTQDTF 
TZHRSTGTI8 
QKYEAHVPEN 
TTRKGLDFEA 
EGZPT6EPVC 
NNIYBVMVIA 
DL8FBTSPFQ 
TVIRATVCDC 
EDDTRDNVPY 
PDBIGNFIIE 
EWGSRFKKLA 



41 

1 

GGAEQEPGQA 
IIiRRHKRDHV 
EKETGiaiLLN 
RGSVLEGVbP 
VISSGIiDREK 
AVGREVQRLT 
KNQHTLYVEV 
VYTAEDPDKE 
KDNQSPPTTQ 
AQIiTDDSDJy 
HGHVBTCPOP 



HLKAAKTDPT
DMYQGGEDD

51

|

LGKVPMGCPG
VAPZSVPENG
KPLDREEIAK
GTSVMQVTAT
VPBYTLTZQA
VTDLDAPNSP
TNEAPFVLKL
NQKISYRZIA
TGTLIiLTZilI}
HTAEVMBESD
HKGGPILFVL
QDYDITQLHR
APPYDTLLVF

Seq ID NO: 540 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..672

1 11 21 31 41 51

| | | | | |

ATOAGGCTCC AAAGACCCCG ACAGGCCCXX3 GCGGGTGGGA GGCGC3GCGCC CCGGGGCGGG 60
OGGGGCTCCC CCTACCXSGCC AGACCCGGGG A6AGGCGC6C GGAG6CTG06 AAGGTTCCAG 120
AAGGGOdOGEG AGG0G6O8OC 6CSGC6CI6AC CCTCCCIGGG GAOCXSCTOOS GACGAT8GCG 180
CTGCTOGCCr TGCTGCT6CST CGTG6CCCTA COGCGGGTOT GOACAGACSC CAACCTGACT 240
GOQAGACAAC GAGATCCAGA GGACTCCCAG OSAACGGAOQ AGGQT6ACAA TAGAGTGTGG 300
TGTCATGTTT GTGAGA6AGA AAACACTTTC QAGTGCCA6A ACCCAAGGAG GTGCAAATGG 360
ACAGAGCCAT ACTGCGTTAT AGCXSGCCGTG AAAATATTTC CACGTTTTTT CATGGTTGCG 420
AAGCAGTGCT CGGCTGGTTG T6CAGCGATG QAOAGAGCCA AGOCAGAGGA GAAGCXSGTTT 480
CTCCTGGAA6 A6CCCATG0C CTTCTTTTAC CTCAAQTQTT GTAAAATTCG CTACTGCAAT 540
TTAGAG6G6C CACCTATCAA CTCATGA0T6 TTCAAAGAAT ATGCVGG6AG CAT6GGT6AG 600
AGCTGTQGTG GGCT6T0GCT G8CCAT0CTC CTGCT6CIG0 CCTCCATTGC AOCOOQCCTC 660
AGCCTGTCTT GA


Seq ID NO: 541 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRLQRPRQAF AGGRBAFRGG RQSPYRFDPG RGARRLRRFQ RCGEGAPRAD PPWAPLGTMA 60
LLALLLWAL PRVWTDANLT ARQRDPEDSQ RTDEGDNRVW CHVCERENTP ECQNPRRCKW 120
TEPYCVIAAV KIFPRFFMVA KQCSAGCAAM ERPKPEBKRF LZiEEPMPFFY LKCCKZRYCN 180
UaaPPWSSV PKEYAGSMGE SOSCLKLAZIi ZiLLASIAAGL SL3

Seq ID NO: 542 DNA sequence
Nucleic Acid Accession #: XM_035292.2
Coding sequence: 53..1576

1 11 21 31 41 51

| | | | | |

GCTOGCTGGG CGGCG6CTCC GGGGTGTCCC AGGCCCX3GCC GGTGCXKAGA GCATGGCGGG 60
TGGQGGCCOG AA60GG000Q GQCTAOOQGC GCG8GCX3QOC GASGAOAAGG AAGAGGOGOG 120
GGAGAAGATG CT6GCC6CCA AOAGCGCQGA GGGCTGGGOG CCGGCAGG06 AGGGGGAGGG 180
CGTGACCCTG CAGCGGAACA TCAOQCTGCT CAAGGGCGTG GCCATCATOG TGGGGACCAT 240
TATCGGCTCQ GGCATCTTCG T6ACGCCCAC GGGOCrTGCTC AAGGAGGCAG GCTCGCOGGG 300
GCTQGOQCTG GTGGTGTGGG CCG06TG06G CGTCTTCTCC ATCXSTGGGOG CGCTCTGCTA 360
GGG6GAGCTC GGCACCACCA TCTCCAAATC GG60GGCGAC TA06CCTACA TGCTGGAGG7 420
CTAOGGCTOG CTGCCG6CCT TCCTCAAGCT CTOGATGGAG CTGCTCATCA TCCGGCCTTC 480
ATCGCaOTAC ATCQTGGCCC TGGTCTTC3GC CACCTACCTG CTCAAGCOGC TCTTCCCCAC 540
CTGCXrCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCX3TGC TGCTGCTCAC 600
GGCGGTGAAC TGCTACAGCG TGAAGGCGGC CACC06GGTC CAGGATGCCT TTGCCGCCGC 660
CAAGCTCCIG GCCCTGGCCC TGATCATCCT 6CTGGGCTTC GTGCAGATGO GAAAG6GTGA 720
TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGAT6 TGGGGAACAT 780
TGTGCTG6CA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTOST 840
CACAGAGGAA ATGATCAACC CCTACAOAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900
CATCGT6AC6 CTGGTGTACG TGCTGACCAA CCTQGCCTAC TTCACCACCC TGTCCACaSA 960
GCAOATQCTQ TCXSTCGOAGG CGOrGGCGGT GGACTT06GG AACTATCACC T6GG0GTCAT 1020
GTCCT66ATC ATCCCCMTCT T06TGGGCCT GTCCTQCTTC GGCTCOGTCA ATGGGTCCCT 1080
GTTCACATCC TCCAGGCTCT TCTTOGTGGG GTCCOGGGAA G6CCACCTGC CCTCCATOCT 1140
CTCXATQATC CACCCACAGC TCCTCACCCC CGTGCCGTCX: CTCGTQTTCA CGTGTGTGAT 1200
GACGCT6CTC TAGGCCTTCT CCAAG6ACAT CTTCTCOGTC ATCAACTTCT TCAGCTTCTT 1260
CAACTG6CTC 7G0QTG6CCC TGGCCATCAT GQGCATGATC TGGCTGGQCC ACA6AAAGCC 1320
TGAGCTTGAG 066CCCATCA AG6T6AACCT GGCCCTGCCT GTCTTCTTCA TCCTGGCCTG 1380
CCTCTTCCTG ATCGCCX3TCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TOSGCTTCAC 1440
CATCATCCTC AGCGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCXrCAA 1500
GTGGCTCCTC CAGGGCATCT TCTCCAOGAC OGTCCTGTGT CAGAAGCTCA TGCAGGTGGT 1560
CCCCCAGGAG ACATAGCCAG GAGGCCGAGT GGCTGCCGOA GGA6CATGC

Seq ID NO: 543 Protein sequence
Protein Accession #: XP_035292.2

1 11 21 31 41 51

| | | | | |

MA6AGPKRRA IiAAPAAEEKE EAREKMLAAR SASG5APAGB GEGVTLQIUiZ TLLNGVAHV 60
OTIIGSGIFV TPTGVLKEAG SPGLALWWA ACGVFSIVGA LCYAELGTTI SKSGGDYAYM 120
LBVYGSLPAP LKbWIELLII RPSSQYIVAL VFATYLLKPL FPTCPVPEBA AKLVACLCVL 180
LLTAVMCYSV KAATRVQDAF AAAKLLALAIi IILLGFVQIG KGDVSNIJ3PN FSPEGTKIiDy 240
GUnVLALYSG LEAYGGHliyL MFVTEBMINP YHHLPIAIII SliPIVTLVYV LTNLAYFTTL 300
STBQMLSSEA VAVDPGNYHL GVMSWIIPVF V6I»SCFGSVN GSLFTSSRLP FVGSRBGHLP 360
SILSMIHPQL LTPVPSLVPT CVMTLLYAFS KDIFSVIMPF SFFKMLCVAL AIlOIIMliRH 420
RKPBLERPIK VNLALPVFPI LACLPLIAVS PWRTPVEOSI OPTIILSGLP VYPPQVWWMI 480
KPKWLLQGIP STTVLCQKIiM QWPQET

Seq ID NO: 544 DNA sequence
Nucleic Acid Accession #: NM_005268.1
Coding sequence: 168..989

1 11 21 31 41 51

| | | | | |

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCC6CT 60
TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGCTGG GAGCCAGGAG 120
AGCCCTGAGG A6TAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA 180
TCTTTGAGG6 ACTCCTGAGT OGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240
TGTCTCTGGT CTTCATCTTC CGOGTGCTGG TGTACCTGGT GACGGCCGAG CGTGTGTGGA 300
GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCT6CTCC AA06TCTGCT 360
TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT AICCTGGTOA 420
CATGCCCCTC ACTGCTCGTG GTCATGCAOQ TGGCCTACCG GGAGGTTCAG GAGAAQAGGC 480
AGCX3A6AAGC CCATOSGGAG AACaGTCGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540
GTGGGCTCTO GTOGACATAT 6TCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT 600
TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG 6TCAAGTGCC 660
ACGCAQATCC ATGTCCCAAT ATAQTGGACT QCTTCATCTC CAAGCCCTCA GAGAAGAACA 720
TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTC3QTG6AGC 780
TCATCTACCT 03TGRGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840
TGTOCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900
CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC 960
GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCIG GACTGGTCIG GCAGGTTGOO 1020
CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGC3U10C TQAGAGTGGS GGAGCTAAGC 1080
CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAQCCAG TTCCTAGTCC 1140
TCAACTCCAQ CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA 1200
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC

Seq ID NO: 545 Protein sequence
Protein Accession #: NP_005259.1

1 11 21 31 41 51

| | | | | |

MNWSIPEGLL SGVNKYSTAP GRIWLSLVFI FRVLVYLVTA EKVWSDDHKD PDCNTRQPGC 60
SNVCFDEPFP VSHVRLWALQ LILVTCPSLL WMHVAYREV QEKRHREAHG EMSGRLYLNP 120
GKKRGGLWWT YVCSLVPKAS VDIAPLYVFH SFYPKYILPP WKCHADPCP NIVDCPISKP 180
SEKNIFTLFM VATAAICILL NLVBLIYIiVS KRCHBCLAAR KAQAMCTGHH PHGTTSSCKQ 240
DDItLSGDLIF LGSDSBPPLIi PDRPBDHVXK TIL

Seq ID NO: 546 DNA sequence
Nucleic Acid Accession #: NM_002391.
Coding sequence: 26..457

1 11 21 31 41 51

| | | | | |

CQGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60
OGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120
C0C6GGGAGC GAGTGCGCTQ AGTGQGCCTQ GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180
CGGCGTGGST TTCOGCGAGG QCACXITaOGG GGCCCAOACC CAGCGCATCC GGTGCAGGGT 240
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CQACTGCAAG TACAAGTTTG AGAACTGGGG 300
TGCGTCTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AQQOQOGCTA 360
CAATGCTCAG TGCCAGGAGA CCATCOGCGT CACCAAGCCC TGCACCCCCA AQACCAAAGC 420
AAAGOCCAAA GCCAAGAAAG GGAAGGGAAA GQACTAGACG CCAAGCCTGG ATGCCAAGQA 480
GOCCCTGGTQ TCACATOGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540
CACCAGTGCC TTCTGTCTGC TOGTTAGCTT TAATCAATCA TGCCCTGCCT TOTCCCTCTC 600
ACTCCCCaGC CCCACCCCTA AGTGCCCAAA GTGGGGftGGO ACAAGGGATT CTOOGftAGCT 660
TQAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCQCTT TTGTTCTTCC CCACAATTCC 720
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTITT 780
TAATAT

Seq ID NO: 547 Protein sequence
Protein Accession #: NP_002382.1

1 11 21 31 41 51

| | | | | |

MQHRGFUjLT LLALLALTSA VAKKKDKVKK GGPGSBCABH AWGPCTPSSK DOSVGPREST 60
CGAQTQRIRC RVPCHWKKEF GADCKYKPEK WGACDGGTQT KVRQGTbKKA RYNAQGQBTl 120
RVTKPCTPKT KAKAKAKKGK GKD



Seq ID NO: 548 DNA sequence
Nucleic Acid Accession #: NM_006783.1
Coding sequence: 1..786

1 11 21 31 41 51

| | | | | |

ATGQATTGGG GGACQCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGOVTC 60
GGQAAGGTGT GGATCACAGT CATCTTTATT TTC06AGTCA TGATCCTAGT GGTQGCTQCC 120
GAQOAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTQCA ACCX3GGATQC 180
AAAAATOTGT GCTATGACCA CTTTTTCCOG 6TGTCCCACA TCCGGCTGTQ G6CCCTCCAG 240
CTGATCTTOG TCTCCAOCCC AGOGCTGGTO OTOGOCATGC ATGTGOCCrA CTACAOGCAC 300
GAAACCACrC GCAAGTTCAG 6CGAGGAGA0 AAGAGQAATO ATTTCAAAGA CATAGA6GAC 360
ATTAAAAAGC ACAAGGTTCG GATAGA6GGG TCGCTQTGGT GGA06TACAC CAGCAGCATC 420
TTTTTCCGAA TCATCTTTGA AQCAGCCTTT ATGTATQTGT TTTACTTCCT TTACAATGGG 480
TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACGCCT GOCGCAACCT TGTTGACT6C 540
TTTATTTCTA GGCCAACAOA GAAGACGQT8 TTTAOCATTT TTATGATTTC TGOGTCTQTO 600
ATTT6CATQC TGCTTAAOST GGCAOAGTTG T6CTACCTGC T6CTGAAA6T GTGTTTTAQG 660
AGATCAAAGA OAGCACAQAC GGAAAAAAAT CACOOCAATC ATGOCCTAAA GGAOAGTAAO 720
CAGAATQAAA TQAATGAGCT 6ATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 780
AGCTAA


Seq ID NO: 549 Protein sequence
Protein Accession #: NP_006774.1

1 11 21 31 41 51

| | | | | |

NDHGTLHTFI GGVNXHSTSZ (^CVWITVIFI FRVNILWAA QEVWGDEQED FVCNTL0F6C 60
KNVCYDHPFP VSHIRLWALQ LIPVSTPALL VAMHVAYYRH ETTRKFRRGB KRNDPKDIED 120
IKKHKVRIEG SLWWTYTSSI PFRIIFEAAP MYVFYPLYNG YHLPWVLKCG IDPCPNLVDC 180
FISRPTEKTV PTIFNISASV I01LLNVAEL CVLLLKVCFR RSKRAQTQKH HFNHALKBSK 240
QNEIQIELISD SGQtTAITGFP S

Seq ID NO: 550 DNA sequence
Nucleic Acid Accession #: NM_002571.1
Coding sequence: 99..587

1 11 21 31 41 51

| | | | | |

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 60
TCACCCTGGG GGTGGCCCTG GTCTGTOGTG TCCGGGCCAT QGACATCCCC CAGACCAAGC 120
A6GACCTGGA GCTCCCAAAG TSQQCMBQQk CCTOOCACTC CATGGOCATG GOGACCAACA 180
ACATCTCCCT CAltSGOGACA CTQAAGGCCC CTCTGAGGGT CCACATCAOC TCACTGTTGC 240
CCACCCCOGA GGACAACCTG GAGATCGTTC T6CACAGATG GGAGAACAAC AGCTGTGTTG 300
AGAAGAAGGT CCTTGGAGA6 AAGACTGGGA ATCCAAAGAA 6TTCAAGATC AACTATACQO 360
TGGOQAACQA 6GCCACGCTG CTCQATACTG ACTAOGACAA TTTCCTGTTT CTCTGCCTAC 420
AGQftCAOCAC CAGCCCCATC CAOAGCATOA TGTGOCAGTA CCrGGCCAGA GTCCTGGTGG 480
AGGACQATSA GATCATGCA6 GGATTCATCA QGGCTTTCAG GCCCCTGCCC AGGCACCTAT 540
6GTACTTGCT GGACTTGAAA CA6ATGGAA6 AGCCGTGCGG TTTCTAGCTC ACCTCCGCCT 600
CCAGGAAGAC CAGACTCCCA CCCTTCCACA GCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660
TTTCAAAGAA TAACCACAGC TCAGAAGAOG ATGAGGTGGT CATCTGT6TC GCCATCCCCT 720
TCCT6CTGCA CACCT6CACC ATTGCCATGO GGAQGCTQCT CCCtGOGGQC AQAGTCTCT6 780
GCAGAGGTTA TTAATAAACC CTTGGAGGAT G


Seq ID NO: 551 Protein sequence
Protein Accession #: NP_002562.1

1 11 21 31 41 51

| | | | | |

MDIPQTKQDL BLPKLAGTWH SMAMATNNZ8 LNATLKAPLR VHITSLLPTP EDNLEIVLHR 60
KENNSCVEKK VLGBKT6NPK KFKIMYTVAN BATLLDTDYD NFIiFLCLQDT TTPIQSMHOQ 120
YLARVLVEDD EIMQGFIRAF RFLPRHLHYL LDIiKQHBEPC RP

Seq ID NO: 552 DNA sequence
Nucleic Acid Accession #: NM_006500.1
Coding sequence: 27..1967

1 11 21 31 41 51

| | | | | |

ACITGOGTCT CGC0CTCCG6 CCAAGCATGG 6GCTT0CCAG 6CT6GTCTQC GCCTTCTTGC 60
TOGCCGCCTG CTGCT6CTGT 0CTCG0GT06 CGGGT6TGCC OGGAGAGOCT GAGCAGCCTO 120
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTQC GGCCTCtCCC 180
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTQT CCACAAGGAG AAGGGGACGC 240
TCATCTTCCG TGTGOGCCAG GGCCAGGGCC AGAGOGAACC TGGGGA6TAC GAGCAGCGGC 300
TCAGCCTCCA GGACAGA6GG GCTACTCTGG CCCTGACTCA AGTCACCCOC CAAGAGGAGC 360
6CATCTTCTT GTGCCA6GGC AAGOGCCCTC GGTCCCA6GA GTACCGCATC CAGCTCCGC6 420
TCTACAAAGC TCOGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480
GTAAQGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540
TCATCTGGTA CAAQAAT66C CGGCCTCTGA AG6AG6AGAA GAACCGGGTC CACATTCAGT 600
OSTOOCAGAC TSTOQAOTOS AGT8STTTGT ACACCTTGCA GAGTATTCTG AAOGGAOISC 660
TGGTTAAAGA AQACAAA^T GCCCAOTTTT ACTOTGAGCT CAACTAOCGG CTGCCCAGTG 720
GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780
TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGOGTG GAAATCAGGT 840
GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900
GGQA6GCA6A GGAAGAGACA ACCAAC8ACA AGGGGGTCCT GSTGCTGQAG CCTGGCCGOA 960
A6GAACACA6 TGGG06CTA7 OAATGTCAGG CCTGQAACTT GQACACCATO ATATOGCTOC 1020
TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA OGTCCGAGTG AGTCCCGCAG 1080
CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAOGCAGAG AGTAGCCAGG 1140
ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200
TTCAGTTGCA TGACCTGAAA CGGGAG6CAG GA6GCX3GCTA TCGCTGOGTG GCGTCTGT6C 1260
CCAGCATACC GGGCCTGAAC GQCACACASC TGGTCAAGCT G6CCATTTTT GGCCCCCCTT 1320
66AT6GCATT CAAGGAGAGO AAQQT6TGGG TQAAAGAGAA TATGGTGTTG AATCTGTCTT 1380
GTGAAGCGTC AGGGCACCCX OGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGT6 1440
AACAAGACCA AGATCCACAG QGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGA8C 1500
TGTTGGAGAC AGGTGTTGAA TGCAC3GGCCr CCAAG6ACCT GGGCAAAAAC ACCAQCATCC 1560
TCTTCCTGGA GCTOOTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620
TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680
TGCOSGAGCC GGAGAGCCGG GGCGTGGTCA TC6TGGCTGT GATTGTGTGC ATCCTGGTCC 1740
TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG GC3QTGCAGGC 1800
6CTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACOSAACTT GTAGTTQAAG 1860
TTAA6TCA6A TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GG6CAGCA6C GGTGACAA6A 1920
GG6CTCGGGG AGACCAG6GA GAGAAATACA TOGATCTGAG GCATTAGCCC CGAATCACTT 1980
CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040
CCTCCAAAGG GACTAGAGAG AAGCCrCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100
GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAG66ACCA 2160
OTCCACCAOC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220
CCGA6CGGGT AGGAQAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280
AAATACCTGO CTCCTGCCAG CAGCT6AGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340
CAAAGGCTGG CTTCCACCAT CCAGQTGCAC CACTGAAGTO AG6ACACACC GGAGGCAGGC 2400
GCCTOCTCAT GTT6AAGTGC GCTGTTCACA CCCX3CTC0G0 AGAGCAGCCC AGCGGCATCC 2460
AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520
ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580
GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCA6CACTTT GGGAGGCCGA G6CGGGC6GA 2640
TCACAAAGTC AQGACGAGAC CATCCTQGCT AACAGGOIOA AACCCTGTCT CTACTAAAAA 2700
TACAAAAAAA AATTAGCTAG GCGTAGTGGT TG6CACCTAT AGTCCCAGCT ACTCG6AAGG 2760
CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT OCAGTGAGCC 6AGACCGTGC 2820
CACTGCACTC CAGCCTGGGC AACACAGCGA GfiCTCOGTCT CGAGGAAAAA AAAAGAAAAG 2880
ACGCGTACCT GCGGTGAGGA AGCTGGGGGC TGTTTT06AG TTCAGGTGAA TTAGCCrCAA 2940
TCCCCX?rGTT CACTTGCTCC CATAOCCCTC TTGATGGATC A06TAAAACT GAAAGGCAGC 3000
GGGGAGCAGA OVAAGATGAG GTCTACACTO TCCTTCATGG GGATTAAAGC TATQGTTATA 3060
TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCXAAATGAO 3120
AGAATGGTAC TTA6GGATGG AAAACGGGGC CTGGCTAGAG CTTCX3GGTGT GTGTGTCTOT 3180
CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGT6TGTA AATTTGCAAA 3240
TTGTTTCCTT TATATATOTA TQTATATATA TATATQAAAA TATATATATA TATGAAAAAT 3300
AAAGCTTAAT TGTOCCASAA AATCATACAT T6CTTTTTTA TTCTACATGG GTACCACAGG 3360
AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCA6C 3420
AQAGATCAGG GGTTACCTCT GCTTCTGAGC AAATG6CTCA AGCTCTACCA GAGCAGACAG 3480
CTACCCTACT TTTCAGCAGC AAAACGTCCC 6TATGAGGCA GCACGAAGGG CCTGGCAG6C 3540
TGTTAOCAGG A5CTATGTCC CTTOCTATOG TTTCOGTCCA CTT



Seq ID NO: 553 Protein sequence
Protein Accession #: NP_006491.1



GLPRLVCAPL 
WP8VKKEKRT 
RSQBYRIQLR 



TVPVFYPTBK 
NGVLVLBPAR 
TIiTCEAESSQ 
IiVKIiAlFGPP 
SniNVLVTPE 
RANSTStBRK 
PSRKTEbWE 



11 
I 

LAACCCCPRV 
LIFRVRQGQG 
VYKAPEEPNI 
SSQTVBSSGXj 
VWLBVBPVGM 
KEHSGRYECQ 
DLEFQWLRKE 
WMAFKERKVW 
LLETGVECTA 
LPSPBSRGW 
VKSDKLPEEM 



21 

I 

AGVPGBAEQP 
QSEPGGYEQR 
QVNPLQIPVN 
YTLQSILKAQ 
IJCEQDRVEIR 
AUNLDTMISL 
TDQVLER6PV 
VKENMVLNIiS 
SHDL6RMTSI 
IVAVIVCILV 
GLLQGSSGDK 



31 
1 

APBLVEVEVG 
L8UQDRGATI1 
SKEPEBVATC 
LVKEDKDAQF 
CLADGNPFPH 
LS6PQELLVN 
LQIiHDLKRBA 



LFLBLVNLTT 
LAVLGAVLYF 
RAPGDQGEKir 



41 

I 

STALLKCGLS 
ALTQVTPQDE 
VSRNGYPIPQ 
YCBLNYRLPS 
FSISKQNPST 
YVSDVRVSPA 
GGGYRCVASV 
ISKNVNGTAS 
LTPDSNTTTG 
LYKKGKLPCR 
IDLRH 



51 
I 

QSQQNLSHVD 
RIFLGQGXRP 
VXWYKZ9GRPL 
GNHMKESREV 
REAEEBTTND 
APERQB6SSL 
PSIPGXi^TQ 
EQDQDPQRVIi 
LSTSTASPBT 
RStSCQEITLP 



60 
120 
160 
240 
300 
360 
420 
460 
540 
600 



Seq ID NO: 554 DNA sequence
Nucleic Acid Accession #: NM_003183.3
Coding sequence: 165..2639

1 11 21 31 41 51

| | | | | |

TGGAGCCTGG CGGTAGAATC TTCCCAGTAG G06GGGCGGG AGG6AAAAGA GGATTGAGGG 60
GCTAGGCCGG GCX3GATCCCG TCCTCCCCCG ATGTGAGCAG TTTTCOGAAA CCCOGTCAGG 120
GGAAG6CTGC CCAGAGAGGT <%3AGTCaGTA GC3GGGGCGGG GAACATGTIGG CAGTCTCTCC 180
TATTOCTGAC CAGOQTGOTT CCTTTCaTGC TQ6GGCO6CX9 ACCTCOGGAT GACCCX3GGCT 240
TOGGCCCCCA CCA6AGACTC GAGAA6CTT6 ATTCTTTGCT CTCAGACTAC GATATTCTCT 300
CTTTATCTAA TATCCAGCAG CATTCXaGTAA GAAAAAGAGA TCTACAGACT TCAACACAT6 360
TAGAAACACr ACTAACTTTT TCAGCTTTGA AAAGGCATTT TAAATTATAC CTGACATCAA 420
GTACTGAACG TTTTTCAC3UV AATTTCAAGG TCGTGGTGGT GGATGGTAAA AACGAAAG06 480
AGTACACTGC AAAATGGCAG GACTTCTTCA CTGGACACGT GGTTGGTGAG CCTGACTCTA 540
GGGTTCTAGC CCACATAAGA GATGATGATG TTATAATCAG AATCAACACA GATGGGGCCG 600
AATATAACAT AGAGCCACTT TGGAGATTTG TTAATGATAC CAAAGACAAA AGAATGTTAG 660
TTTATAAATC TGAAGATATC AAGAAT6TTT CACGTTTGCA CTCTCCAAAA GTGTGTGGTT 720
ATTTAAAAGT GGATAATGAA GAGTTGCTCC CAAAAGGGTT AGTAGACAGA GAACCAGCTG 780
AAOASCTTGT 7CATCQA6TG AAAAGAAGAG CTGACCCAGA TCCCATGAAG AACACGTGTA 840
AATTATTGQT GGTAGCAGAT CATCGCTTCT ACAGATACAT GGGCAGAGGG GAAGAGAGTA 900
CAACTACAAA TTACTTAATA GAGCTAATTG ACAGAGTTGA T6ACATCTAT GGGAACACTT 960
CATGGGATAA TSCAGGTTTT AAAGGCTATG 6AATACAGAT AGAGCA6ATT G6CATTCTCA 1020
AGTCTCXACA AGAGGTAAAA CCTGGT6AAA AGCACTACAA CATOGCAAAA AGTTACCCAA 1080
ATQAA6AAAA GGATGCTTGG GATGTGAAGA TGTTGCTAGA GCAATTTAGC TTTGATATAG 1140
CTGAG6AAGC ATCTAAAGTT TGCTTGGCAC ACCTTTTCAC ATACCAAGAT TTTGATATGG 1200
GAACTCTTGG ATTAGCTTAT GTTGGCTCTC CCA6AGCAAA CAGCCATGGA GGTGTTTGTC 1260
CAAAGGCTTA TTATAGCCCA GTTGGQAAGA AAAATATCTA TTTGAATAGT GGTTTGACGA 1320
GCACAAAGAA TTATGGTAAA ACCATCCTTA CAAAGGAAGC TGACCTGGTT ACAACTCAT6 1380

AATTC6GACA TAATTTTG6A GCAGAACAT6 ATCCG6ATGX3 TCTAQCAGAA TOTGCCCOQA 1440 

A1GA0GACCA GGGAGGGAAA TATGTCATGT ATCCCATAGC TQTOAGTGGC QATCACX5AGA 1500 

AOVATAJVGAT GTTTTCAAAC TGCAGTAAAC AATCAATCTA TAAOACCATT GAAAGTAAGG 1560 

CCCAGGA6TG TTTTCAAGAA CGCAGCAATA AAGTTTQTGG GAACTOGAGO GTGGATGAAG 1620 

GASAAGAGT6 TGATCCTGGC ATCATGTATC TGAACAAGGA CAGCTGCTGC AACAGOQACT 1680 

6CACGTTCAA OGAAGGTGTC CAGTGCA6TG ACAGGAACAG TCCTTGCTGT AAAAACTOTC 1740 

AGTTTGAGAC TGCCCAGAAG AAGTGCCAGG AGGOGATTAA TGCTACTTGC AAAGG06TGT IBOO 

CCTACTGCAC AGGTAATAGC AGTGAGTGCC CGCCTCCAGG AAATGCTGAA AATGACACTG 1860 

TTTGCTTGGA TCTTGGCAAG TGTAAGGATG GGAAATGCAT CCCTTTCTGC GAGAGGGAAC 1920 

AGCAGCTGGA GTCCTGT6CA T6TAATGAAA CTQACAACTC CT6CAAGGIG TGCTGCAGGG 1980 

ACCTTTCTGG COGCTGTGTG CCCTATGTCG ATGCIGAACA AAAGAACTTA TTTTTGAGGA 2040 

AAGQAAAGCC CTGTACAGTA GGATTTTGTG ACATGAATGG CAAATGTGAG AAAOGAGTAC 2100 

AGGATGTAAT TGAACX3ATTT TGGGATTTCA TTGACCAGCT GAGCATCRAT ACTTT TGGAA 2160 

AGTTTTTA6C AGACAACATC GTTGG6TCT6 TCCTG6TTTT CTCCTTGATA TTTTGGATTC 2220 

CTTTCAGCAT TCTTGTCCAT TGTGTGQATA AGAAATTGGA TAAACAGTAT GAATCTCTGT 22B0 

CTCTGTTTCA CCCCAGTAAC GTOGAAATOC TGAGCAGCAT GQATTCTGCA TC3GGTT05CA 2340 

TTATCAAACC CTTTCCTGOG CCCCAGACTC CAGGCCQCCT GCAGCCTGOC CCTGTGATCC 2400 

CTTOGGCGCC AGCAGCTCCA AAACTGGACC ACCAGAGAAT GQACACCATC CAGGAAGACC 2460 

CCAGCACAGA CTCCCATATG GAOGAGGATO GGTTTGAGAA GGACCCCTTC CCAAATAGCA 2520 

GCACAGCTGC CAAGTCATTT OAOGATCTCA GGGACCATCC GGTOGCCtta AOTQAAAAGG 2580 

CTGCCTCCTT TAAACT6CA0 OSTCAfiAATC GTGTTAACAO CAAAGAAAC3V GAGT6CTAAT 2640 

TTAGTTCTCA GCTCTTCTGA CTTAAGTGTG CAAAATATTT TTATAGATTT GACCTACAAA 2700 

TCAATCACRG CTTGTATTTT GTGAAGACTG GGAAGTGACT TAGCAGATGC TGQTCATGTO 2760 

TTTGAACTTC CTQCAGGTAA ACAGTTCTTG TGTGGTTTG6 CCCT TCTCCT TTTGAAAAGQ 2820 

TAAOGTGAAA GTGAATCTAC TTATTTTGAO GCTTTCA68T TTTAGTTTTT AAAA TATC TT 2880 

TTQACCTGTG GT6CAAAASC A6AAAATACA GCTGGATTGG GTTATGAATA TT TACGTTTT 2940 

TGTAAATTAA TCTTTTATAT TGATAACAGC ACTQACTAGG QAAATGATCA GTTTTTTTTT 3000 

ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA GAAAAGTGGA 3060 

ATAGTTTTTT TTTTTTTTTT TTTTTTTTGC CTTCAACTAA AAACAAAGGA GATAAATTTA 3120 

6TATACATTG TATCTAAATT GTGQOTCTAT TTCTAOTTAT TACCCAGAGT TTTTATGTAG 3180 

CAGGGAAAAT ATATATCTAA ATTTAQAAAT CATTTGGGTT AATATGGCTC TTCATAATTC 3240 

TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA TGGTAGCCftG 3300 

TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCCTGATT TGCTGGGCAG TTTTTCTGTA 3360 

TTTTATAAGT ATCTTCATQT ATCCCTGTTA CTQATAGGGA TACATGTCTT AGAAAATTCA 3420 

CTATTGGCT6 GGAGTGGTGG CTCAT6CCTG TAATCOCAGC ACTTGGAGAG GCT6AGGTTG 3480 
CGCCACTACA CTGCAGCCTG GGIGACAGAG TGAGATCTGC CTC 

Seq ID NO: 555 Protein sequence
Protein Accession #: NP_003174.2

1 11 21 31 41 51

| | | | | |

MRQSLLFLTS WPPVLAPRP PDDPGPOPHQ RLBRLDSLLS DYDILSI/SNI QQHSVRKKDL 60 

QTSTHVETLL TPSALKRHFK IjYLTSSTERF SQNFKWWD GKNBSBYTAK WQDFPTGHW 120 

GBPDSRVIAB XRSDOVIIRI NTDQAEVNXE PUVRFVHDTK DXRMLVYKSB DIKNVSRLQS 180 

PKVOSYIifCVD NEBLLPKOLV DREPPEELVB RVKRRADPOP MRNTCKLLW ADH RFYRY MG 240 

RGEBSTTTNY LIELIDRVDD lYRKTSMDNA GFKGYGIQIE QIRILKSPQB VKPGEKHTOM 300 

AKSYPNEEKD AWUVKMLLEQ PSPDIABEAS KVCLAHLFTY QDFDMGTLGL AYVGSPRANS 360 

KGGVCPKAYY SFVGKXNIYL NSGLTSTK2IY GKTILTKEAD LVTTHELGHN FGAEHDPDOL 420 

AECAPNEDQO GKYVMYPZAV SGDHBINKMF SNCSKQSIYK TIESKAQECF QERSHKVOGN 480 

SKVDB6EBC9 PGIMYUniDT CCNSDCTLKE GVQCSDRN8P CCKNCQFETA QKK CQEAI NA 540 

TCKGVSYCTO KSSBCPPPGN ABNDTVCLDL GKCKDGKCIP FCEREQQLES CAC3JBTDNSC 600 

KVCCRDLSGR CVPYVDABQK NLFLRKGKPC TV6PCDMNQK CEKRVQUVIB RFWDPIDQLS 660 

INTFGKFLAD NIVGSVLVPS LIFWIPPSIL VHCVDKKLDK QYBSLSLPHP SNVEMLSSMD 720 

SASVRIXKPF PAPQTPGRLQ PAPVIPSAPA APRUIEQSND TIQEDPSTDS HKDED6FEKD 780 
PFPNSSTAAK SFEDLTDHFV ARSEKAASFK K^RQNRVNSK BTBC 

Seq ID NO: 556 DNA sequence

Nucleic Acid Accession #: NM_021832.1

Coding sequence: 164..2248

1 11 21 31 41 51

I I I I I I

TOGAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGC6GG60 AGGAAAAGAG GATT6AGGG6 60 

CTAGQCCGGG CGGATCCCGT CCTCCCCOQA TGTGAGCAGT TTTCCGAAAC CCCGTCAGGC 120 

GAAGGCTGCC CAGAGAGGTG GACTCGGTAO OGGGGCOGGG AACATGAGGC AGTCTCTCCT 180 

ATTCCTGACC AGCGTGQTTC CTTTOGTGCT GGCJGCOGCJGA CCTCC6GATG ACCCGGGCTT 240 

CGGCCCCCAC CAGAGACTOG AGAAGCTTQA TTCTTTQCrC TCAOACXACXS ATATTCTCTC 300 

TTTATCTAAT ATCCA6CA0C ATTCGOTAAO AAAAAOAGAT CTACAGACTT CAACAOITOT 360 

AGAAACACTA CTAACTTTTT CAGCTTTGAA AAGGCATTTT AAATTATACC TGACATCAAQ 420 

TACTGAACQT TTTTCACAAA ATTTCAAGGT OGTGGTGGTG GATGGTAAAA ACXSAAAQCXSA 480 

GTACACTGTA AAATGGCAGG ACTTCTTCAC TGGACaVOGTG GTTG6TGAGC CTQACTCTAG 540 

GQTTCTAOOC OVGATAAGAG ATGAT6ATGT TATAATCAGA ATCAACACAO ATGGGGCCSQA 600 

ATATAACATA 6AGCCACTTT GGA6ATTTGT TAATGATACC AAAGACAAAA GAATGTTAGT 660 

TTATAAATCT GAAGATATCA AGAATGTTTC ACGTTTGCAG TCTCCAAAAG TGTGTGGTTA 720 

TTTAAAAGTG GATAATGAAG AGTTGCTCXX: AAAAGGGTTA OTAGACAGAG AACCACCTGA 780 

AGAGCTTGTT CATCX3AGTGA AAAGAAGAGC TGACCCAGAT CCCAT6AAGA ACACGTGTAA 840 

ATTATTG6T6 GTA6CAGATC ATC3SCTTCTA CAQATACATG G6CAGA6GGG AAGAGAGTAC 900 

AACTACAAAT TACTTAATAG A6CTAATTGA CAGAGTTGAT GACATCTATC GGAACACTTC 960 

ATGGGATAAT GCAG8TTTTA AA6GCTATGG AATACAGATA GAGCAGATTC GCATTCTCAA 1020 

GTCTCCACAA GAGGTAAAAC CTGGTQAAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080 

TGAAGAAAAG GATGCTTGGG ATGTGAAGAT GTTGCTAGAG CAATTTAGCT TTGATATAGC 1140 

T6AGQAAGCA TCTAAAGTTT GCTTGGCACA CCTTTTCACA TACCAAGATT TTGATATQGO 1200 

AACTCTT6GA TTAGCTTATG TTGGCTCTCC CAGAGCAAAC AGCCATGGAG 0TGTTT6TCC 1260 

AAAGGCTTAT TATAGCCCAG TTGGGAAGAA AAATATCTAT TTGAATAGTG 6TTTGACGAG 1320 

CACAAAGAAT TATQGTAAAA CCATCCTTAC AAAGGAAGCT GACXTTGOTTA CAACTCATGA 1380 

ATTGGGACAT AATTTTGGAG CAGAACAT6A TCCX^ATGGT CTAGCAGAAT GTGCCGCGAA 1440 
TGAGGACCAG GGAGGGAAAT ATGTCATGTA TCCXaTAQCT GTOAOTGGCG ATCACGAGAA
CAATAAGATG TTTTCAAACT GCM3TAAACA ATCRATCTAT AAGACCATTG AAAGTAAGGC 
CCAGGAGTGT TTTCAAGAAC GCAGCAATAA AGTTT0T6GG AACTOGAGGG TGGATGRAGG 
AGAAGAGTGT GATCCTGQCA TCATGTATCT GAACRACX3AC ACCTGCTGCA ACAGCGACTG 
CACGTTGAAG GAAGOTGTCC AGTGCAGT6A CAGGAACAGT CCTTGCTGTA AAAACTGTCA 
GTTTGAGACT GCCCAGAAGA AGTGCCAGGA GGOOATTAAT GCTACTTGCA AAGGC6TGTC 
CTACTGC3VCA GGTAATAGCA GTGRGTGCCC GCCTCChGGA AATGCTGAAG ATGACACTGT 
TTGCTTG6AT CTTGQC3UV0T GTAAGGATGG GAAATGCATC CCTTTCTGCG AGAQGGAACA 
GCAGCTGGAG TCCTGTGCAT GTAATGAAAC TGACAACTCC TGCAAGGTGT GCTGCAGGGA 
CCTTTCCXSGC CGCTGTQTGC CCTATGTCGA TGCTGAACAA AAGAACTTAT TTTTGAGGAA 
AGGAAAGCCC TQTACftGTAG GATTTTGTGA CATGAATGGC AAATGTQA6A AACGAGTACA 
GGAT6TAATT GAAOCSATTTT GGGATTTCAT TGACCAGCTG A GCATC RATA CTTTTGGAAA 
GTTTTTAGCA GACAACAT06 TT6GGTCTGT CXrrGGTTTTC TCCTTGATAT TTTGGATTCC 
TTTCAGCATT CTTCTCCATT GTGTGTAACG TCGAAATGCT QAGCAGCATG GATTCTGCAT 
CGGTTCGCAT TATCAAACCC TTTCCTGCGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 
CTGT6ATCCC TTCGGCGCCA GCAGCTCCAA AACTGGACCA CCAGAGAATG GACACCATCX: 
AGGAAGACCC CaGCACAOAC TCACATATCQ ACGAGQATGG GTTTGAGAAG QACCCCTTCC 
CAAATAGCAG CACAGCTGCC AAGTC3VTTTG AGGATCTCAC GGACCATCCG GTCACCAGAA 
GTGAAAAGGC TGCCTCCTTT AAACTGCAGC GTCA6AATCG TGTTGA CAGC AAAGAAACAG 
AGTGCTAATT TAGTTCTCAO CTCTTCTGAC TTAAGTGTGC AAAATATTTT TATAGATTTG 
ACCTACAATC AATCACAGCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGATGCTG 
6TCATGTGTT TGAACPTCCT GCAGGTAAAC AGTTCTTGTG TGGTTTGGCC CTTCTCXTTT 
TGAAAAOGTA AGGTQAAGGT GAATCTAGCT TATTTTGAGG CTTTCAGGTT TTAGTTTTTA 
AAATATCTTT TGACCTGTGG TGCAAAAGCA GAAAATACAG CTGGATTGGG TTATGAGTAT 
TTACGTTTTT GTAAATTAAT CTTTTATATT GATAACAGGC ACTGACTAGQ GAAATOATCA 
GTTTTTTTTT ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTQTQA 
GAAAAGTGGA ATAGTTTTTT Trrrm ' TI i TTTTTTTTGC CTTCAACTAA AAACAAAGGA. 
GATAAATTTA GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT 
TTTTATGTAG C3W3GGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC 
TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GQGCTATACA 
TGGTAGCCaG TTGAATTTAT GGAATCTACX: AACTGTTTAG GGCCCTQATT TGCTGGGCAG 
TTTTTCT6TA TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT 
AGAAAATTCA CTArrGGCTC GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAQ 
3421 GCTGAGGTrG CGCCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 557 Protein sequence 
Protein Accession #: NP_068604.1

1 11 21 31 41 51

I I I I I I

MRQSLLFLTS WPPVLAPRP PDDFGFGPEQ RLEKLDSLLS OYDILSLSKI QQHSVRKSint 60

QTSTHVETIiL TPSALKRHFK LYLTSSTERF SQNFKVWVD GKNESBYTVK WQDPFT6HVV 120

GEPDSRVLAH IRDDDVIIRI NTDGAEYNIE PLWRFVNDTK DKRMLVYK5E DIKNVSRLQS 180

PKVOGYLKVD NBELLPK6LV DREPPEBLVH RVKRRADPDP MKHTCKLLW ADHRFYRYMG 240

RGEESTTTNY IiIELIDRVDD lyBNTSHDNA GFRSYGIQIE QZRILKSFQB VXPOEKHyNM 300

AKSYPNEEKD ANDVKMLIiEQ FSFDZAEEAS KVCIiAHLFTY QDFDMGTLGIi AYVGSPRAN8 360

HGGVCPKAYY SPVGKKNIYL KSGLTSTKNY GKTILTKEAD LVTTHELGHN PGAEHDPDGL 420

AECAPNEDQG GKYVMYPIAV [illegible] SNCSKQSIYK TIESKAQECF QBRSNKVCCai 480

[illegible] PGIMVU9KDT COfSDCTLKB 6VQCSDRNSF CCKHGQFETA QKKCQEAINA 540

TCXON^CTO KSSECPPPGN AEDDTVCXiDL 6KCKDGKCIP [illegible] CACDETDNSC 600

KVCCRDLSGR CVPyVDAEQK NLFLRXGKPC TVGPCDtVSRiK CEKSVQDVIE RFHDFIDQLS 660

INTFGKFLAD NIVGSVLVFS LIFWIPPSIL VHCV

Seq ID NO: 558 DNA sequence

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20..2143

1 11 21 31 41 51

I I I I I I

AGACACCTGT GCCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCXrrGGTGC TCCTGGTGCT 60

G6GCTQCTGC TTT6CTGCCC CCAGACAG06 CCAGTCCACC CTTGTGCTCT TCCCTGGAGA 120

CCTGAGAACC AATCTCACCX3 ACAGGCAGCT GGCAGAGGAA TACCTGTACC GCTATGGTTA 180

CACTCXX3GTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGGGC TGCTGCTTCT 240

CCAGAAGCAA CTGTCCCTGC CCGAGACOGG TGA6CTGGAT AGC6CCACGC TGAAGGCCAT 300

6CGAACCCCA C366TGCX3GGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AG6GCGACCT 360

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTCGGAAG ACTTGCCGOG 420

GGCGGTGATT GACGAOGCCT TTGCCOGOGC CTTCGCACTG TGGAGCGCGG TGACGCX:GCT 480

CACCTTCACT C6CQTGTACA GCXX3GGACGC AGACATCGTC KSOCMSmG GTGTG60G6A 540

GCACGGAGAC GGGTATCCCT TCGACGGGAA GGACGGGCTC CTG6CACACG OCTTTCCTCC 600

TG6CCXX;GGC ATTCAGGGAG ACGCX:CATTT CGACGAT6AC GAGTTGTGGT CCCTGGGCAA 660

GGGCX3TCGTG GTTCCAACTC GGTTT6GAAA CGCAGATGGC GCGGCCTGCC ACTTCCCCTT 720

CATCTTCGAG GGCCX3CTCCT ACTCTGCCTG CACCACXXSAC GGTCGCTC06 ACG6CTT6CC 780

CTGGTGCAGT ACCAOGGCCA ACTAGGACAC CGACGACGGG TTTGGCTTCT QCCCCAGCaA 840

GAGACTCTAC ACCCX3GGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGACG6TCGC TCCGACX5GCT ACCGCTGGTG 960

CGCXyVCCACC GCCAACTACG ACCGGGACAA GCTCTTCGGC TTCTGCCCGA CCCGAGCTGA 1020

CTCGACGGTG ATGGGGG6CA ACTCGGCGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080

GGGTAAGGAG TACTGGACCT GTACCAGGGA GGGCC6GGGA GATGQ0CX3CC TCT6GTGCGC 1140

TACCACCTOG AACTTTGACA GCGACAA6AA GTGGGGCTTC TGCCCGGACC AAGGATACAG 1200

TTTGTTCCTC GT6GCGGCGC ATGAGTTCGG CCACGCGCTG GGCTTAGATC ATTCCTCAGT 1260

GC0GGAGGC6 CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320

OGACGTGAAT GGCATCC36GC ACCTCTATQG TCCTCGCCCX GAACCXGAQC CACGGCCTCC 1380

AACCACCACC ACACCGCAOC CCAGGGCTCC GCOQAGGOTC TGCCCCACCG GACCCCCCAC 1440

TOTCCACCCC TCA8A60GCC CCACA6CTG6 CCCCACAGGT CCCCCCTCAG CTGGCCCX:AC 1500

AGGTCCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG CCTTTGAGTC CGGTGGACGA 1560

TGCCTGCAAC GTGAACATCT TCGACGCCAT CGCGGAGATT GGGAACCAGC TGTATTTCTT 1620

CAAGGATGGG AAGTACTGGC GATTCrCTGA GGGCAGGGGG AGCCiGGCOGC AGGGCCCCTT 1680

CCTTATOSOC GACAAGTGGC CCGCGCTGCC CCGCAAGCTG GACTCGGTCT TTGAGGAGCC 1740

GCTCTCCAAG AAGCTTTTCT TCTTCTCTQG 6CX3CCAGGTG TGGGTGTACA CAGGOGCGTC 1800 

GGTGCTGGGC CCGAGGOGTC TGGACAAOCT GGGCCTQGGA GCCGACOTQG CCCAGGTGAC 1860

COGGGCCCTC CXSGftGTGGCA GGGGGAAOAT GCTGCTOTTC A6GGGGCGGC GCCTCTG6A6 1920 

GTTCGAOJTG AAGGOGCAGA TGGTGGATCC CCGGAGCGCC AOCQAGGTGO ftCCOQATGTT 1980 

CCCCOGGGTG CCTTTG6ACA CX3CACGACGT CTTCCAGTAC GGAOAGAAAG CCTATTTCTG 2040 

CCAGGACCGC TTCTACTGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGACCAA6T 2100 

QGGCTACGTG ACCTATOACA TCCTGCaVGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCXGGG6A AGGAGCCAGT TTGOCGGATA 2220 

CAAACTG6TA TTCTOTTCTG GAQGAAAOGG AGC3AGTG6AG GTQGOCTGGO CXXTCTCTTC 2280 
TCACCTTTGT TTTTTOTTGO AGTGTTTCTA ATAAACTTG6 ATTCTCTftAC CTTT 

Seq ID NO: 559 Protein sequence
Protein Accession #: NP_004985.1

1 11 21 31 41 51

I I I I I I

MSLWQPLVLV ZjLVLGCCFAA PRQRQSTLVL FPGDLRTNLT DRQLAEEYLY RYGYTRVAEM 60 

RGBSKSU3PA LUiLQKQItSL PETGEIiDSAT LKAMRTPROG VPDLGRFQTF BGDLKWHHHN 120 

ZTYWIQUYSE DI.FRAVIDDA PARAFALMSA VTFLTFTRVY SRDADIVZQF GVASHGDGYP 180

FDGKbGLLAH APPPGPGIQG DAHFDDDBLW SLGKGWVPT RFQNADGAAC BFPFZPEGRS 240 

YSACTTDGRS DGLPWCSTTA NYDTDDRFGF CPSBRLYTRD OJADGKPCQP PFIFQGQSYS 300 

ACTTDGRSDO YRWCATTANY DRDKLFQPCP TRADSTVMGG NSAGELCVPP FTFliGKEYST 360 

CTSBGRGDGR IiKCATTSNFD SDKKMGFCPD QGYSIiPX<VAA HEFGHALGLD HSSVFEALMY 420 

PHYRFTBapP LHKDOVNGZR HLYGPRPEPB PRPPTTTTPQ PTAPPTVCPT GPPTVEPSBR 480 

PTAGPT6PPS AGPTGPPTA6 PSTATTVPLS PVDDACNVIII FDAIAEKBIQ LYLFKDGKXW 540 

RFSBGRGSRP QGPFLIADKH PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TOASVUSPRR 600 

AQVTGALRSG RGKMLLFS6R RLWRFDVKAQ HVDPRSASBV DRMFP6VPLD 660 
THDVFQYREK AYFCQDRFYW RVSSRSBUIQ VDQVGYVTYD ILQCPED 

Seq ID NO: 560 DNA sequence

Nucleic Acid Accession #: NM_000213.1

Coding sequence: 127..5385

1 11 21 31 41 51

I I I I I I 

OGCCOGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GOGCGGAGOG AGCGAGTCCG 60 

CCCXX3AGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGCCGG GAGAGGGAGG 120 

AA6AGGATGG CAGG6CCAG6 CCCCAGCCCA TGC3GCCAGGC T6CTCCTG6C AGCCTTGATC 180 

A6C6TCAGCC TCTCTGGGAC CTTGGCAAAC GX3CTGC3kA6A AGGGCCCAGT GAAGAGCIGC 240 

ACGGAGTGTG TCCGTGTGGA TAAGGACTGC 6CX:TACTGCA CAI3ACX3AGAT GTTCAGGGAC 300 

OGGCGCTGCA ACACCCAG6C GGA6CTGCTG GCCX3CGQGCT GCCAQCXSGOA GAGCATOGTG 360 

GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCGGCGC 420 

AGCCAGATGT CCCCCCAAGG CCTGOGGGTC CGTCTGOGGC CCeGTGAGGA GCGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACPGGAGA6C CX»GTGQACC IGTACATCCT CATCGACTTC 540 

TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAA6A TOGGGCAGAA CCT60CTCG0 600 

6TCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGGGTCCCX3C AGACX3GACAT GAGGCCTGAG AAGCT6AAGG AGCCCTGGCC CAACAGTGAC 720 

CCCOCCTTCT CCTTCAAGAA C6TCATCAGC CTGACAQAA6 ATGTGGATGA GTTCCGGAAT 780 

AAACTGCAGG GAQAGCGGAT CTCAGGCAAC CTGGATGCTC CTQAGGG06G CTTCGATGCC 840 

ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCTG 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGCX3CCAA CGTGCTGGCT 960 

GGCATCATGA GCCX3CAACGA TGAAGGGTGC CACCTGGACA CCACGGGCAC CTACACCCA6 1020 

TACAGGACAC AGGACTACCC GT0GGT6CCC ACCCTGGT6C 6CCTGCT06C CAA6CACAAC 1080 

ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140 

TATTTCCCTG TCTCCTCACT G6GGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200 

CTG6AQGA6G CCTTCAATC6 GATCC6CTCC AACCTGGACA TCOGGGCCCT A6ACA6CCCC 1260 

CX3AGGCCTTC GGACAGAGGT CAGCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATCXiGGC GGG6GGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGA6CACGT6 1380 

GATGQGAOGC ACGTGTGCCA GCTGCOGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG ACGGCCTCAA QATGGACGCG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CTGCAAAAAG AGGT6CX3GTC AGCTCGCTGC AGCTTCAAOG GAGACTTCGT GT6CGGACA6 1560 

T6TGTOTGCA G0QAQGGCT6 GAOTGGCCAiO AOCTOCAACT 0CT0CACGQ6 CICTCTQAGT 1620 

GACATTCAGC CCTGCCTGCG GGAaGGGSAG GACAASCCOT GCTCCGGCOG TGGGGAOTGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCOGCTACG AGGGTCAGTT CTGOGAGTAT 1740 

GACAACTTCC AGTGTCCCOG CACTTCCGGG TTCCTCTGCA ATGACOGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTOACTG TCCCCTCAGC 1860 

AATGCCACCT GCA7CXSACA6 CAATGGG66C ATCT6TAAT0 GA0GT86CCA CT8TGAGT0T 1920 

GGC0GCT6CC ACTOCCACCA GCAGTGGCTC TACA06QACA 0CATCT6C6A GATCAACTAC 1980 

TOOGCGATCC ACCCGGGCCT CTGCX3AGGAC CTACGCTCCT GCGTGCAGTO CCA6QCGTGG 2040 

GGCACCG6CG AGAA6AAGGG GCGCACGTGT GAGGAATGCA ACTTCAAGGT CAA6ATGGTG 2100

GAOQAOCTTA AGAGAGCCGA GGAGGTG6T6 GTGCGCTGCT CCTTC06GGA 06AGGATGAC 2160 

QACTGCACCT ACAGGTACAC CATGGAAGGT GACGGCGCCC CTG6GCCCAA CAGCACTGTC 2220 

CTGGTGCACA AGAAGAA6GA CTGCOCTCOG GGCTCCTTCT 66T6GCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TGCC3QCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400 

GAAGACCACT ACATGCTGCO GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACX3CCCAT6 2460 

CTG0GCAGC6 GQAACCTCAA GOGCCSTQAC GTGGTCGGCT 6QAAGGTCAC CAACAACATG 2520 

CAGCGGCCTG GCTTTGCCAC TCATGCOGCC AGCATCAACC CCACAGAGCT GGTOCXX^TAC 2580 

GGGCTGTCCT TGCGCCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACACTOGG 2640 

GAGTGOGCCC AGCTGC6CCA GGAGGTGGAG GAGAACCTQA AOGAGGTCTA CAGGCAGATC 2700 

TCCGGTGTAC ACAA6CTCCA GCAGACCAA6 TTCCGGCAGC AGCCCAATGC CXSGGAAAAAG 2760 

CAAOACCACA CCATTGTGGA CACA6TGCIG ATGGCGCCX:C GCTGGGCCAA GCCGGCCCIG 2820 

CTOAAGCTTA CAGAGAAGCA G6TG6AACA6 AG6GCCTTCC ACGACCTCAA G6TGGCCCCC 2880 

GGCTACTACA CCCTCACTGC AGACX»GGAC GCCCGGGGCA TGGTGGAGTT CCAGGAGGGC 2940 

GTQGAQCTGG TGGACX3TACG GGTGCCCCTC TTTATCCGGC CTGAGGATGA C6ACGAGAAG 3000 

CAGCTGCT6G TGGAQOCCAT CGACGTGCCC GCAGGCACTG CCACGCTOQG CCGCQGCCTG 3060 
GTAAACATCA CCATCATCAA GQAGCAAGCC AGAGACGrGG TGTCCTTTGA GCAGCCTGAG 3120 

TTCTCX3GTCA 6CCGCGGGGA 0CAG0T6GCC CGCATCCCTG TCATCCGGCG TGTCCTGQAC 3180 

GGOGGGAAOT CCCAGOTCTC CTACCS3CACA CAGGATGGCA CCGCGCAGGG CAACCGGQAC 3240 

TACATCCCCS TGGAGGQTGA QCTGCT6TTC CA6CCTGGGG AGGCCTGGAA AGAGCTGCA6 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCCX3CCA GGTCGGCCGT 3360 

TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACX: TGGGCCAGCC CCACTCCACC 3420 

ACCATCATOV TCSUSGOACCC ASATGAACTG GACCGGAGCT TCACGAGTCA QATGTTQTCA 3480 

TCACAOCCAC CCCCTCA05G 03ACCTGG6C GCCCOGCAGA ACCCCAATGC TAAGGCCGCT 3540 

GGGTCCAGGA AGATCCATTT CAACTGGCTG CCXCCTTCTG GCAAGCCAAT GGGQTACAGO 3600 

6TAAAGTACT GGATTCAGGG TGACTCCGAA TCCQAAGCCC ACCTGCTCGA CAGCAAGQTO 3660 

CCCTCAGTGG AGCTCAGCAA CCTGTACCCG TATTGCGACT ATOAGATGAA GGTGTGCGCC 3720 

TAGGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780 

GTGCCCAGOG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCAOGGT GACCCAGCTG 3840 

AGCTGGGCTG AGCGGGCTGA 6ACCAA06GT GAGATCACA6 CCTAOGAGGT CTGCTATG6C 3900 

CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TOACAACCCT 3960 

AAGAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAGTCCC AGCCCTACCG CTACA0GGT6 4020 

AAGGCGOGCA ACGGGGCCGG CTGQGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGGGAT6 AG6TTCTAC6 CTCTCCATOQ 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCAOC TGGTQAATGG CCGGATGGAC 4260 

TTTGCCTTCC CGGQCAGCAC CAACTCCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 

ACACX3GGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTCQACCAC ACTGCC6AGG 4440 

GACTACTCCA CCCTCACCTC OGTCTCCTCC CACGACTCTC GCCTGACT6C TGGTGTGCCC 4500 

GACACGCCCA CCCGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCAO AGTGAGCTG6 4560 

CAGGAGCCGC GGTGCGAGCG 6CCGCTGCAG GGCTACA6TG TGGAGTACCA GCTGCTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTOGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGTGTTC CGOGTGCGGG CCCAGAGCCA GGAAGGCTG6 4740 

GGCCGAGAGC GTGAGQOTGT CATCACCATT GAATCCCAGG TGCACCCGCA GAGO CCAC TG 4800 

TGTCCCCTGC CAG6CTC0GC CTTCACTTTG AGCACTCCCA GTGCCGCAGG CCGGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGOGGCCAOG GAOGCCCAAT 4920 

GOGGATATCO TCGGCTACCT GGTGACCTGT GAGATGGCCC AAGGAGQAOG 6CCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCCGGCTGA CCGTGCCGGG CCTCAGCGAQ 5040 

AACGTGCCCT ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGG6CTTCG6 GCCAGAGC6C 5100 

GAOGOCATCA TCACCATAGA 6TCCCAGGAT 66AGGA0CCT TCCCGCAGCT GGGCAGCCX3T 5160 

GCOQGGCTCT TCCAGCACCC GCTGCAAAGC GAGTACAGCA QCATCACCAC CACCCACACC 5220 

AGC6CCACCG AGOCCTTCCT AGT6GATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

GGCGGCTCCC TCACCCGGCA TGTGACCCAG GAGTTTGTGA QCOGGACACT GACCACCAGC 5340 

GOAAOCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTT6ACCGCA CCCTGCOCCA 5400 

CCCCOGCCAT GTOCCACTAO GCGTCCTCCC GACTCCTCTC COQQAOCCTC CTCAG CTAC T 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCOGCATGCA CA6AGCAGG6 GCTAGGrGTC 5520 

TCCTGGGAGG CATGAAGGGG GCAAGGTCGQ TCCTCT8TG6 GGCCAAACCT ATT TGTAAC C 5580 

AAAGAGCTGG GAGCAOCACA AGGACGCAaC CTTTGTTCTG CACTTAATAA ATGQTTTTOC 5640 
TACTG 

Seq ID NO: 561 Protein sequence
Protein Accession #: NP_000204.1

1 11 21 31 41 51

I I I I I I

MA6PRPSPHA RLLIiAALISV SIiSSTLANRC KKAFVKSCTE CVRVDKDCAY CTDENFRDRR 60 

CNTQAEUAA GCQRESIWM ESSFQITEET QIDTTLRRSQ r4SPQGIiRVRL RPGEERHPEL 120 

EVFEPLESPV DLYILMDFSN SMSDDIiDNLK KMGQNLARVL SQLTSDYTIG PGKFVDKVSV 180

PQTDMRPEKL KEPWPNSDPP PSPKNVISLT EDVDEFRNKIi QGERIS<aiIiD APEGQFDAIL 240 

QTAVCTRDZG WRPDSTHLLV FSTSSAFHYE ADGANVLAGI MSRNDERCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNIl PIFAVTOYSY SYYEKLHTYF PVSSLGVLQE DSSNIVELLE 360 

EAPNRIRSNL DIRALDSPRQ LRTBVTSKMP QKTRTGSFHI RRGEVGIYQV QLRAIiEHVDG 420 

THVCQLPEDQ KGNIHLKPSF SDGLKMDAGI ICDVCTCELQ KEVRSARCSF NGDFVCQQCV 480

CSBGWSGQTC NCSTGSLSDI QPCLREGEDK PCSGRGECQC GHCVCYGEGR YEGQFCEYDN 540 

FQCPRTS6FL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCECGR 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQOQAWQT QEKKGRTCEE OJPKVKMVDE 660 

liKRAEEVWR CSFRDEDDDC TYSYTMBGDG APGPNSTVLV HKKKDCPPGS FWWLIPLLLL 720 

LLPLIALLLL LCWKYCACCK ACLALLPCCS RGHNVGFKED aYMLRENLNA SDHtDTPMLR 780 

SGNLKGRDW RWKVTNNMQR PGFATHAASI NPTELVPYGL SLRLARLCTE NLIiKPDTRBC 840 

AQLRQEVEEN LNEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA PHDLKVAPGY YTLTADQDAR OWEFQEGVE LVDVRVPLFI RPBDDDEKQL 960 

LVEAIDVPAG TATLORRLVN ITIIKEQAHD WSFEQPBFS VSRGDQVARI PVIRRVLDGG 1020 

KSQVSYRTQD GTAQGNRDYI PVGGBLIiFQP GEAWKEXiQVK IiIiELQEVDSL LR6RQVRRFH 1080 

VQLSZ^KFGA HLGQPHSTTI ZXRDPDELDR SFTSQMLSSQ PPPHGDLGAP QNFNAKAAGS 1140 

RKIHFNWLPP SQKPM6YRVK YWIQGDSBSE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200 

AQGEGPYSSL VSCRTHQEVP SEPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGLV 1260 

NDDNRPIGPM KKVLVDNPKN RHLLIENLRB SQPYRYTVKA RNGAGWGPER EAIINLATQP 1320 

KRPMSZPXIF DIPIVDAQSG EDYD8FLKYS DDVIiRSPSQS QRPSVSDDTE HLVNGSMDFA 1380 

FPGSTNSLHR HTTTSAAAYG THLSPBVPHR VtiSTSSTLTR DYNSLTRSEH 8HSTTLPRDY 1440 

STLTSVSSHD SRLTAGVPDT PTRLVFSALG PTSLRVSWQE PRCERPLQGY SVEYQUiNGG 1500 

ELHRUJIPNP AQTSWVEDL LPNHSYVFRV RAQSQEGWOl ERBGVITIES QVHPQSPIiCP 1560 

LPGSAFTLST PSAPGPLVFT ALSFOSLQLS WERPRRPNGD IVGYIiVTCEK AQGGGPATAP 1620 

RVDQD8PESR LTVPGLSENV PYKFKVQART TEGFGPERBG IITIESQDGG PFPQLGSRAG 1680 

LFQHPLQSEY 8SITTTHTSA TEPFLVOCOT LGAQHIiBAaG 8LTBUVTQBF VSRTLTTSGT 1740 
LSTHMDQQFF QT 

Seq ID NO: 562 DNA sequence

Nucleic Acid Accession #: NM_013332.1

Coding sequence: 1..63

1 11 21 31 41 51 

I I I I I I 

GCACGAGGGC GCTTTTGTCT CCGGTGAGTT TT6TG606G6 AAGCTTCTGC GCTGGTGCTT 60

AGTAACCXSAC TTTCCTCCX3G ACTCCTGCAC QACCTGCTCC TACAGCCGGC GATCCACTCC 120

OGGCTGTTCC CCCGGAGGGT CCAGAGGCCT TTCAGAAGGA GAAGGCAGCT CTGTTTCTCT 180

GCAGA6QA6T AGQOTCCTTT CAGCCAT6AA GCATGTGTTG AACCTCTACC TGTTAGGTGT 240

GGTACTQACC CTACTCTCCR TCTTOQTTAO AGTGATGGAG TCOCTAOAAG GCTTACTAGA 300

GAOCCCATCG CCTGGGACCT CCTG6ACCAC CA6AA6CCAA CTAGCCAACA CAOAGCOCAC 360

CAAGOGCCTT CCAGACCATC CATCCAGAAG CATGTGATAA GACCTCCTTC CATACTGGCC 420

ATATTTTGGA ACACTGACCT AGACATGTCC AGATGGGAGT CCCATTCCTA GCAGACAAGC 480

TOAGCACCGT TGTAACCAGA GAACTATTAC TAGGCCTTGA AGAACCTGTC TAACTGGATO 540

CTCATTGCCT GGGCAAGGCC TGTTTftGGCC G6TT60SGT0 OCTCATGCCT GTAATCCTAG 600

CACTTTGGGA GGCTGAGGTG GGTGGATCAC CTGAGGTCAG QAQTTOGAGA CCAGCCTCGC 660

CAACATGGCG AAACCCCATC TCTTACTAAAA ATACAAAAGT TAGCTG66TG TGGTGGCAGA 720

GGCCTGTAAT CCCAGTTCCT TGGGA6GCTG AGGCX3GGAQA ATTGCTTGAA CCCGGGGACG 780

GAGGTTGCAG TGAACCGAQA TCGCACTGCT GTACCCAGCC TGGGCCACAG TGCAASACTC 840

CATCrCAAAA AAAAAAAOAA AAGAAAAAGC CTGTTTAATG CACAGGTGTG A6T0QATTGC 900

TTATGGCTAT GAGATAGGTT GATCTCGCCC TTACCC06G6 GTCTGGTGTA TGCTGTGCTT 960

TOCTCAGCAO TATGGCTCTG ACATCTCTTA GATGTCCCAA CTTCAGCTOT TGGGAGATGG 1020

TCATATTTTC AACCCTACTT CCTAAACATC TGTCTGGGGT TCCTTTAGTC TTGAATQTCT 1080

TATGCTCAAT TATTTGGTGT TGAGCCTCTC TTCCACAAGA QCTCCTCCAT OTTTGGATAO 1140

CAOTTQAAGA GQTTGTGTGG GTG6GCT6TT GGGA0T6AG6 ATGQA8TGTT CAOTGCCCAT 1200

TTCTCATTTT ACATTTTAAA GTOSTTCCTC CAACAXAGTG TQTATTGGTC TQAAGGGGOT 1260

0GT6SGA7GC CAAAGCCTOC TCAAOTTATG GACATTGTOO CCACCATGTG GCTTAAATOA 1320

TTTTTTCTAA CTAATAAAGT GGAATATATA TTTGAAAAAA AAAAAAAAAA AA

Seq ID NO: 563 Protein sequence
Protein Accession #: NP_037464.1

1 11 21 31 41 51

I I I I I I

MKHVXHLYLL GWLTLLSIP VRVMESLEQL LESPSPGTSH TTRSQLANTB PTROIiPDBPS 60

RSM



Seq ID NO: 564 DNA sequence
Nucleic Acid Accession #: NM_023915
Coding sequence: 250..1326

1 11 21 31 41 51

I I I I I I

GGCACGAGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCTTAGTT 60

TCAAABCTTA TTCTTAATTA [illegible] ACCTGTTTCA ACTTGAAGAC ACCGTATGAG 120

GT6AATGGAC AGCCAGCC3VC CACAATCSUUl GAAATCAAAC CA66AATAAC CTATGCTQAA 180

CX!CACOCCTC AATCGTCCCC AAGTGTTTCC TGACACGCAT CTTTGCTTAC AGTGCATCAC 240

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCAAAATTAC CAAATAACGA GCTGCAC3GGC 300

CAAQAOAGTC ACftATTCAGG CAACAGGAGC 6ACGGGCCAG GAAAGAACAC CACCCTTCAC 360

AATQAATTTG ACACAATTQT CTTGCGGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420

TTGCTGAATQ GTTTAGCAaT GTG6ATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA CGCTQACATT TCCATTTCGA 540

ATAGTCCATG ATGCAGGATT TGGACCTTGG TACTTCAAGT TTATTCTCTG GAOATACACT 600

TCAGTTTTGT TTTATGCAAA CATGTATACT TCX31TCGTGT TCCTTGGGCT GATAA6CATT 660

GATC6CTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720

A06AAGGTTT TATCTGTTTO TGTTTQGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840

CCTTTGGGGG TCAAATGGCa TAOGGCAGTC ACCTATGTGA ACAGCTGCTT GTTTGTGGCC 900

GTGCTGGTGA TTCTGATCGG ATG7TACATA GCXATATCCA GGTACATCCA CAAATCCAGC 960

AGGCAATTCA TAAGTCAGTC AAGCCX3AAAG CGAAAACATA ACCAGAGCAT CAGGGTTGTT 1020

GTGOCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTCT GCAQAATTCC TTTTACTTTT 1080

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140

ATTACACTTT TCTTGTCTGC GTGTAATGTT TGCCTGGATC CAATAATTTA CTTTTTGATG 1200

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAQ GAGTGAAAGC 1260

ATCAGATCAC TGCAAAGTGT GAGAAGATOG GAAGTTOGCA TATA^ATGA TTACACTQAT 1320

GTGTAGGCCT TTTATTGTTT GTTGGAATGG ATATQTACAA AGTGTAAATA AATGTTTCTT 1380

TTCATTATCC TTAAAAAAAA AA

Seq ID NO: 565 Protein sequence
Protein Accession #: NP_076404

1 11 21 31 41 51

I I I I I I

H6FNLTLAKL PNNELHGQES HNS6NRSDGP GKNTTLBKEF D7IVLPVLYL IIFVASILU7 60

GLAVWIPFHI HNKTSFIPYL KNIWADLIM TLTFPPRIVH DAGFGPWYFK PILCRYTSVL 120

FYANMVTSIV FLGLISIDRY LKWKPFGDS RMYSITPTKV LSVCVWVIMA VIjSLPNIILT 180

NGQPTEONIH OCSKLKSPLO VKHBTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 240

ISQSSRKRKH KQSIRVWAV FFTCFIiPYHL CRZPFTF8BL DRLLDESAQK ZLYYCXSITL 300

FLSACNVCLD FIIYPFMCRS FSRRLFKKSN IRTRSBSIRS LQSVRRSBVR lYYDYTDV

Seq ID NO: 566 DNA sequence

Nucleic Acid Accession #: NM_005365.1

Coding sequence: 1..948

1 11 21 31 41 51

I I I I I I

ATQTCTCTGG AGCA6AGGA6 TCC36CACTGC AAGCCTQATG AAGACCTTGA AOCCCAAGGA 60

GAaSACTTGG GCCTGATGGG TGCACAG6AA CCCACAGGCG AQGAGOAGSA GACTACCTCC 120

TCCTCTGACA GCAAG6AG6A GGAQ6TGTCT GCT6CTGGGT CATOUIGTCC TCCCC3W5AGT 180

CCTCAGG6AG GOGCTTCCTC CTCCATTTCC GTCTACTACA CTTTATGGAG CCAATTCGAT 240

GAGGGCTCCA GCAGTCAAGA AGAGGAAGAG CCAAGCTCCT CGGTCGACCC AGCTCAGCTG 300

GAGTTCATQT TCCAAGAA6C ACTGAAATT6 AAGGTGGCTG AGTTGGTTCA TTTCCTGCTC 360

CACAAATATC GAGTCAiWKSA GCCGGTCACA AAGGCAGAAA TOCTGGWSAG CGTCWCAAA 420

AATTACAAGC GCTACTTTCX: TGTGATCTTC GGCAAAGCCT CCOAGTTCAT GCA6CTGATC 480

TTTGGCACrG ATOTCAAGGA GGTGGACTix: GCOGGCCACT CCTACATCCT TGTCACTGCT 540

CTTCGCCTCT C3GTGCGATAG CATGCTGGGT QATGGTCATA QCATGCCCAA GGCX:GCCCTC 600

CTGATCATTG TCCTGGGTOT QATCCTAACC AAAGACAACT GCGCCCCTGA AGAGGTTATC 660

TGGGAAGC6T TGAGTCTQAT GGGOGTGTAT GTTGGGAAGG AGCACATGTT CTACGGGGAG 720

CCCAGGAAGC TGCTCACCCA AGATTGGOTO CRGOAAAACT ACCTGOAGTA CCGGCAGGTG 780

CCCX3GCAGTG ATGCTGOGCA CTAOOAGTTC CTGTGGGOTT CCS^AGGOCCA COCTGAAACC 840

AGCTATGAGA AGOTCATAAA TTATTTGOTC ATGCTCAATG CAAGAGAGCC CATCTOCTAC 900

CX31TCCCTTT ATGAAGAG6T TTTGGGAGAG GAGCAAGAGG GAOTCTGA



Seq ID NO: 567 Protein sequence
Protein Accession #: NP_005356.1

1 11 21 31 41 51

I I I I I I

HSI^RSPHC KPDEDLEAQG EDZXSIiMGAQB PTGEEEBTT9 SSDSKEEEVS AAGSSSPPQS 60

PQG6ASSSIS VYYTLWSQFD EOSSSQGEEB PSSSVDPAQL EFMFQEAUKL KVAELVHPLL 120

HKYRVKBPVT KAEMLBSVIK NYKRYPPVIP QKASEPMQVI FGTDVKEVDP AGHSVILVTA 180

LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEBVI WEALSVMGVY VGKEKNFYGB 240

PRKLLTQDWV QENYliEYRQV PGSDPAHYEF LHGSKAHAET SYEKVINYLV MLNAREPICY 300

PSLYEEVLGE EQBQV

Seq ID NO: 568 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86..1126

1 11 21 31 41 51

I I I I I I

GGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCGGAGCTOS GAG6CX30CAC ACCCAGGOGG 60

GACGCCAAGG GAGCA06ACG GAGCCA1X3GA CCCCGCCAGG AAAGCAGGT6 CCCAG6CCAT 120

GATCTGGACT GCAGGCTGGC TQCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 180

GTGCTACA6C TGCGTGGAGA AAGCAGATGA CGGATGCTCC CCGAACAA6A TGAAGACAGT 240

GAAGTGGGCG CCX5GGCGTGG AC6TCTGCAC 06AGGCCGTG GGGGCGGTGG AGACCATCCA 300

CGGACAATTC TCGCTGGCAQ TGCSGGGTT6 CQGTTOQGGA CTCCCCGGCA AGAATQACCQ 360

CGGCCTGGAT CTTCAOGGGC TTCTGGCGTT CATCCAfiCTO CAGCAATQ09 CTCAGGATGO 420

CTOCAAOGCC AAGCTCAACC TCACCTCGCG GGCGCTCGAC CCGGCAGGTA ATGAGAGTGC 480

ATACXX96CCC AACGGGGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGQG AGGCX3TGCCA 540

GGGTACATCG CCGCCGGTOG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAA6GGCTG 600

CTTCGACGGC AACGTCACCT TGACGGCAGC TAATOTGACT GTQTCCTTGC CTGTCCGGG6 660

CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAOTAACA GGCCCAGGGT TCACGCTCAG 720

TGGCTCCTGT T6CCAGGGGT CCCGCltlTi'AA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780

CCCTCGAATC CCACCCCTTG TCCGGCTGCC CCCTCCAGAG CCCACGACTG TGGCCTCAAC 840

CACATCTGTC ACCACTTCTA CCTCGGCCCC AQTGAGACCC ACATCCACCA CCAAACCCAT 900

GCCAGOGCCA ACCAGTCAOA CTOCQAGACA GOQAOTAGAA CAOQAGGCXrr CCOGGGATOA 960

GGAGCCCAGG TTGACTGGAG GCGCCXSCTOG CCACCAGGAC 06CAGCAATT CAGGGCAGTA 1020

TCCTGCAAAA GGGGGGCCCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCRCAGCTGG 1080

ATTGGCAGCC CTTCTQTTGG CCGTGGCTGC TGGTGTCCTA CTGTGAGCTT CTCCACCTGG 1140

AAATTTCCCT CTCACCTACT TCTCTGGCCC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 1200

CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCXICA GTATCCCCAG 1260

CTTCTGCT6C GCTGGTTTGC GGcrrrGGGA AATAAAATAC CGTTGTATAT ATTCT6GCAG 1320

QGGTGTTCTA GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATCCTTGTC TCTCCGCTTG 1380

TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AA6TCAGCTG TCACGGGGAA GGTGAGAGAG 1440

A6GAT6CTAA GCTTCCTACT CACTTTCTCC TAGCCA6CCT GGACTTTGGA GCGTGGGGTG 1500

GSTGGQACAA TG6CTCCX7CA CTCTAAGCAC TGCCTGCCCT ACTCCCCGCA TCTTTGGGGA 1560

ATGGGTTOCC CATATGTCTT CCTTACTAGA CT6TGA6CTC CTCGAGGGCA GGGAOCGTGC 1620

CTTATGTCTG TGTGTQATGA QTTTCTGGCA CATAAATGOC TCAATAAA6A TTTAATTACT 1680

TTGTATAQTG

Seq ID NO: 569 Protein sequence
Protein Accession #: NP_055215

1 11 21 31 41 51

I I I I I I

MDPARXAGAQ AMIHTA6WLL IiLLLRGGAQA IiECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60

CTBAVQAVET IBGQFSLAVX GC6SGLPGKN DRGLDLHGIiL AFIQLQQCAQ DRCNAKIiNLT 120

SRALDPAGNE SAYPPNGVEC YSCVGLSREA CQGTSPPWS CVNASDHVYK GCFDGNVTLT 180

AANVTVSLPV RGCVQDEFCT RDGVTGPGPT LSGSCCQGSR CNSDLPNKTY FSPRIPPLVR 240

LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHBASR DBEPRLTGQA 300

A6HQDR8NS6 QYPAKGGPQQ PHNXGCVAPT AGLAALLLAV AAGVLIi

Seq ID NO: 570 DNA sequence
Nucleic Acid Accession #: NM_005329.1
Coding sequence: 1..1662

1 11 21 31 41 51

I I I I I I

ATGCCGGTGC AGCTGACGAC AGCCCTGGGT GT0GTGG6CA CCAaCCTGTT T6GCCTGGCA 60

GTGCTGGGTG GCATCCT6GC AGCCTAT6TG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGACGTGCCG GCCAGGCCCT GAAGCTGCCC 240

TCCCCGCGGC GGG6CTCG6T GGCACTGTGC ATTGCCGCGT ACCAG6AGGA CCCTGACTAC 300

TTGOGCAAGT GCCTGC6CTC GGCCCAGCGC ATCTCCTTOC CTGACCTCAA GGTGGTCATG 360

GTGGTQGATG GCAACCGCCA GGAGGAGGCC TACATGCTGG ACATCTTCCA COAGGTGCTG 420

GGCGGCACCG AGCAGGCCGG CTTCTTTGTG TGGCGCAGCA ACTTCCATGA GGCAGGCGAG 480

GGTGAGACGG AGGCCAGCCT GCAGGAGGGC ATGGACCGTG TGCGGGATGT GGTGCGGGCC 540

A6CACCTTCT CGT6CATCAT GCAGAAGTGG GGAGGCAA6C GGGAGGTCAT GTACACGGCC 600
TTCAAGGCCC TCGGCGATTC GGTGOACTAC ATCCAGGTGT GCQACTCTOA CACTQTGCTG 660

GATCX3VG0CT 6CACCATCX3A GATQCTTOGA GTCCTG6AG0 AGQATCCCCA A6TAGGGGGA 720 

GTOQGGGGftG ATGTCCAGAT CCTCAACAAG TAOGACTCAT GGATTTCCTT CCTGAGCAGC 780 

OTGOGGTACT GGATOGCCTT CAACGTQOAG CXJQGCCTGCC AGTCCTACTT TGGCTGTGTO 840 

CAGTGTATTA GTGGQCCCTT GGGCATOTAC CX3CAACAGCC TCCTCXAGCA GTTCCTGGAG 900 

GACTOOTACC AXCAQAAGTT CCTAQGCA6C AAGTGCACCT TCGGGGATGA CCGGCACCTC 960 

ACCAACOGAG TCCTQAGCCT TGGCTACCGA ACTAAGTATA CXX3CGCGCTC CAAGTGCCTC 1020 

ACAOAOACCC CCACTAAGTA CCTCCX3GTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080 

TACTTCC3GGQ AGTGGCTCTA CAACTCTCTG TGGTTCX3VTA AGCACXACCT CTGGATGACC 1140 

TACGAGTCAG TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCAGGGT TATACRGCTT 1200 

TTCTACOGGG GCCGCATCTG GAACATTCTC CTCTTCXirrGC TGAC3GGTGCA GCTGGTGGGC 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT CGGG6CAATG CAC5AGATGAT CTTCATGTCC 1320 

CrCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AQATCTTTGC CATTQCTACC 1380 

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CGAAAAACCA TTQTGGTGAA CTTCATTGGC 1440 

CTCATTCCTG TGTCCATCTG 66T66CAGTT CTCCTGGQAG GGCTGGCCTA CACAGCTTAT 1500 

T6CCAGGACC T6TTCA6TGA GACAGAGCTA GCCTTCCTTG TCTCTGG66C TATACTGTAT 1560 

6GCT6CTACT GGGTG6CCCT CCTCAT6CCA TATCT6GCCA TCATCGGCX3B GGGATQTGGG 1620 
AAQAAGCOQG AQGAOTACAG CTT6SCTTTT GCTGA6Q1X3T OA 

Seq ID NO: 571 Protein sequence
Protein Accession #: NP_005320.1

1 11 21 31 41 51 

I I I I I I 

MFVQLTTAIiR WGTSLFALA VLGGIIAAYV TOYQFIHTEK RYLSFGLYGA ILGiaiiLZQS 60 

ZiFAFLEHRRM RRAGQALKLP SPRRGSVALC lAAYQEDFDY LRKCLRSAQR ISFPDIiKWM 120 

WDGNRQEDA YMLDIPHEVL GGTBQAGPPV WRSNFHEAGE GETEASLQEG MDRVRDWRA 180 

STFSCIMQKW GGKRBVMYTA FKAI/3DSVDY IQVCDSDTVL DPACTIEMLR VLEBDPQVGQ 240 

VGGDVQILNK YOSWISFLSS VRYHMAFNVK RAGQSYF6CV QCISGPIiGMY RNSLLQQFLE 300 

DPnrBQKFLaS KCSFGDDSHL TNRVLSUmi TKYTARSKCL TBTPTXCniRH LNQQTRHSKS 360 

YFREHLYNSL HFHKBBIAIMT yBSWTQFFP FFLXATVZQL PyRGRZHNIL LFLLTVQLVO 420 

IIKATYACFL RGNAEMIFHS LYSLLYNSSL LPAKIFAXAT ZNKS8WGTSG RKTIWKFZG 480

LIPVSIWVAV I.LGGLAY13^Y GQDLFSBTBL AFLVSGAZLY GCYNVALLML YLAIZARR06 540 
RKPEQYSLAP AEV 

Seq ID NO: 572 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-7095

1 11 21 31 41 51 

I I I I I I 

CACACATACX3 CACGCAOGAT CTCACTTOGA TCTATACACT 6GAG6ATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TQAGAAGCAG AGGAGCCGCA 120 

CGGGQAGGGG C0GCAQACXX3 TCTGGAAATG CX3AATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT tfl'tfm' Ga x ; CCT66ATTGG GCXAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTG6CTG GTCCTATACA GGA0CACT6A ATCAAAAAAA TTGGGGAAA6 300 

AAATATCCAA CaTGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTG6AAATTA ATCTCACTAA TQACTACCS3T 480 

GTCAGCG6AG GAGTTTCA6A AATGGTGTTT AAAGCAAGCA A6ATAACTTT TCACTGGGGA 540 

AAAT6CAATA TGTCATCTGA TGGATCAGA6 CATAGTTTAO AAGGACAAAA AT7TCCACTT 600 

6AGATGCAAA TCTACTGCTT TGATGOQGAC OGATTTTCAA GTTTTQAOGA AGCAGTCAAA 660 

GGAAAAGGQA AGTTAAQAGC TTTATCC3VTT TTGTTTGAGG TTGGGACAQA ASAAAATTTQ 720 

GATTTCAAAO OGATTATTGA TGGAQTOGAA AGTGTTAOTC GTTTTGGGAA GCAGGCTGCT 780 

TTAOATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTAC31TTTAC 840 

AATG6CTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTQGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT aTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTOSAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GQAAAGOAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAQAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCXn' TCTTGTTACA 1140 

TGGGAAAGAC CTOGAGTOGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGAT6 GAGAG6ACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAOAT AOTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTQA TT6TCX3ACAT GCCTACTGAT 1380 

AATCCTGAAC TTQATCTTTT CCCTGAATTA ATTGGAACTQ AAGAAATAAT CAAGGAGGAG 1440 

OAAGAGGGAA AA6ACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TC6CATAGGG 1560 

ACGAAATACA ATQAAGCCAA GACTAACCGA TCCCC3MCAA GAGGAAGIGA ATTCTCTGGA 1620 

AAGQGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAQAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGOTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTGGG G6ACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TQAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCIGAAG ATTCTTCAGG CTCCAQTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAQ AAATQCTTCC 2040 

GAAQATTCAA CTTCATCAGG TTCAGAA6AA TCACTAAAQG ATCCTTCTAT GGAGGQAAAT 2100 

GTGTG6TTTC CTAGCTCTAC AGACATAACA GCACAQCCC3G AT0TT6GATC AGGCAOAGAG 2160 

AGCTTTCTOC A6ACEAATTA CACT6AGATA GQTGTTGATG AATCT6AGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCOCAGT GATGTCACAG GGTCOCTGAG TTACAGATCT GGAAATSCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCAOGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACSU^TG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTA6TC 2460 

ACCCCTTTGT T6CTTGACAA TCAOATCCTC AACACTACCC CTGCTGCTTC AAGTAQTGAT 2520 

TC3GGCCTTGC ATQCTACGCC TGTATTTCCC AGTGTCGATO TGrCA-mOA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTO 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TAC03AGAGT 2700 

GATAAGGT6C CCTT6CAT6C TTCTCTGCCA GTGGCTGG6G GTGATTTGCT ATTAGA6CCC 2760 
AGCCTTGCTC AGTATTCTGA TGTOCTOTCC ACTACTCATG CTGCTTCAGA GAC3GCTGGAA 2820
TTTGGTAGTG AATCTGGTGT TCTTTATAAA AOGCTTATGT TTTCTCAAGT TGAACCACOC 2880
AGCAQTGATQ CCATGATGCA TGCACSTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 
QATAATGAGQ GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 
QATTCTOTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCX3 GCCCTAGCCA TATACCAATA 3060
CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 
QGTGATQGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 
GQGCTGACAG CCCTTAACAT TTCTTCACCr GTTTCTSTAO CTGAATTTAC ATATACftACA 3240 
TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATQAG 3300 
ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGQTTTACC CTTCTGAAAG CACAGTCATG 3360
CCCAACATGT ATCATAATGT AAATAAGTT6 AATGCGTCTT TACAAQAAAC CTCT GTTTCC 3420 
ATTTCTA6CA CCAAGGGCAT GTTTCCAGGO TCCCTTOCTC ATACCACCAC TAAGGTTTTT 3480 
GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTC3«5 TTCAAOCTAC ACATACTQTC 3540 
TCTCAA6CAT CTGGTGACAC TTCfGCTTAAA CCTGtGCTTA GTGCAAACTC A GAGC CAGCA 3600 
TCCTCTGACC CTCCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660
ACCTCAQCTT CTTTTAGTAC TGAAGTATTG CTAC»ACCTT CCTTTCAGGC TTCTGATGTT 3720 
OACACCrrOC TTAAAACTQT TCTTCCRGCT OrOCCCAGTG ATCCAATATT GGTTGAAACC 3780 
CCCAAAGTTQ ATAAAATTAG TTCTACAATO TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 
AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTOSCC TACTTCTCAT 3900 
ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAQAA ATAT6AA0CA 3960
GTTTTGTTAA AAAGTCAAAG TTCCCACXaA GTGGTACCTT CTTTGTACAQ TAATGATGA6 4020
TTGTTCCRAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATOTA 4080 
TTTGCTACAC CTCTrTTATC AATTGATCAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 
CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCT6TTA CTGGTAAGGT ATTTGCTGGT 4200 
ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260
GQGCATGTT6 CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTQTAAC CTCAACAAAG 4320 
TTOCTGTTTC CTTCTAAGOC AACTTCTOAO CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380
TTAGTCGGTG GTGGTGAAGA TGOTGACACT GATGATGATG GTGATGATGA TGATGATGAC 4440 
AGAGGTAGTG ATGGCTTATC CATTCATAAO TGTAT6TCAT GCTCATCCTA TAGAGAATCA 4500 
CAGGAAAAGG TAAT6AATGA TTCAQACACX: CAOGAAAACA 6TCTTATGGA TCAGAATAAT 4560
CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATA6A6TCAC AAQTGTATCC 4620 
TCAGACAiSTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 
TCCCAAAAGC ACAATQATGO AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 
CCrCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GXQATGAAQA AA GTGGAT CA 4800 
GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACftOATTT CASmTOCA 4860
GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACTOMSA AATAACTCCT 4920 
OOATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCAC3GTT 4980 
TCAGAGOCAG AGGCCAC5TAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAG GGGTTG 5040 
GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 
CTAGTCOTTC TT6TGGGTAT TCTCATCTAC TGGAQSAAAT GCTTCCaGAC TGCACACTTT 5160
TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 
ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 5280 
CATGCAAGTA GTGQOTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 
CAGAGCTOTA CTGTT6ACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAA6 5400 
CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA QCAGGGTTAA GCTAGCACaO 5460
CTTGCTGAAA AGGATGGCAA ACTOACTGAT TATATCAAT6 CCAATTATOT TGATQGCTAC 5520 
AACAGACCAA AAGCTTATAT TGCT6CCCAA GGCCCACTQA AATCCACAOC TQAAGATTTC 5580 
TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAQ 5640 
AAAGGAAGGA GAAAATOTGA TCAGTACTGG CCTGCOGATG QGAGTGAGGA GTACGGGAAC 5700 
TTTCTCOTCA CTCAGAAGAG TGTSCAAflTO CTTGCCIATT ATACTGTOftG GAATTTTACT 5760
CTAAQAAACA CAAAAATAAA AAAGGGCTCC CAOAAAGGAA QACCCAGTCG ACGTGTGGTC 5820 
ACACAGTATC ACTACAOSCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 
CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTOGTC 5940 
CACTGCAGTG CTGGAGTTGG AA6AACAGGC ACATATATTG TGCTAGACAG TATGTTGCAO 6000 
CAOATTCAAC ACGAAGQAAC TGTCAACATA TTTOGCTTCT TAAAAC3^CAT CCX3TTCACAA 6060
AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 
GCCATACTTA 6TAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180 
CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 
CAGTCAAATA TACABCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300 
AATC3GAACTT CTTCTATCAT CCCTQTOQAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360
GGAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG QCTATTACCA GAGCAATGAA 6420 
TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAQG ATTTCTGGAG GATGATATQG 6480 
GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATQAA 6540 
TTTGTTTACT GGCCAAATAA AGATQAGCCT ATAAATTQTG AGAQCTTTAA GGT CACTC TT 6600 
ATGGCTOAAG AACAOUU^TG TCTATCTAAT QAGGAAAAAC TTATAATTCA GGACTTTATC 6660
TTAGAAOCTA CACAGQATOA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 
CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAA6TGTTAT AAAAGAAGAA 6780 
GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGAQC ATGGAGGAGT QACGQCAGGA 6840 
ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGAT6TT 6900 
TACCaCGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960
TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGQA AGAGAATCCA 7020 
TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAG CTTA 7080 
GA6TCTTTAG TTTAACACAG AAAGGGGTGG GGGGACTCAC ATCTGAGCAT TGTTTTCX:TC 7140 
TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200 
GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260
CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 
TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 
TTTATAGAGG TTAGSAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 
GCTGTATTTG TAOCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTGT 7500 
AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560
AOAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 
TTTATAATTG TAQATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAATT 7680 
CTTTAOTTTA ATGAOGTAGT TCATTAGCTG OTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 
TTGTGTTACC TAAGTCATTA ACTTTGTTTC AOCATOfTAAT TTTAACTTTT GTGGAAAATA 7800 
GAAATACCTT CATTTTGAAA GAAGTTTTTA TOAGAATAAC ACXTTTACCAA ACATTGTTCA 7860
AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 
Seq ID NO: 573 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGAUIQKNWG KKYPTCMSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKOmSSDGS EHSLEGQRFP LEMQIYCFDA DRFSSFBEAV KGKGKIiRALS 180 

ILFBVGTEEN LDFKAIIOGV ESVSRFGKQA ALDPPILMj LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISBSQL AVPCBVLTMQ QSGYVMLMDY LQNNFREQQY KPSRQVPSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRWYD TMIEKFAVLY QQUJGBDQTK 360 

HBPLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNQLYQKY SDQLIVDMPT DNPELDliPPB 420 

LIGTBBIIKE EEGGKDIBBO AZVNP6RSSA TNQIRKKEPQ ISTTTHYNRI GTKXNEAKTN 480 

RSFTRGSBFS GKGDVPHTSIt NSTSQFVTKL ATEKDZ3LTS QTVTELPPBT VBGTSA8U1D 540 

GSKTVLRSFR MtTLSGTAESL NTVSITEYEE ESLLTSFKLD T6AEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SBNPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWPPSSTDI 660 

TAQPDVGSGR BSFLQTNYTE IRVDESEKTT KSPSAGPVMS QGPSVTDLEM PHYSTFAYPP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ FVYMGETPLQ PSYSSSVFPL VTPLliLDKQI 780 

UNTTFAASSS DSALHATFVF PSVDVSFESI LSSYDGAPLL PFSSA8F8SE ItFRHIiHTVSQ 840 

IIiPQVTSATE SDKVK<HA8L FVAGGDLLLB FSLAQYSDVL STTBAASETL EFGSBSGVLY 900 

KTLMPSQVEP PSSDAMMHAR SSGPEPSYAL SDNBGSQHIP TVSYSSAIPV HDSVGVTYQG 960 

SIiFSGPSHIP IPKSSIiITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTAUIISS 1020 

FV8VABFTYT TSVF6DDNKA LSKSBIZYGN ETELQIPSFN BKVYPSEETV ^4F21MYDNVNK 1080 

LNASLQETSV 8ISSTKQMFP GSLAHTTTKV FOHBZSQVPB HNPSVQPTKT VSQASGDTSL 1140 

KPVL8ANSEP ASSDPASSEM LSPSTQZiLFY ETSA8FSTEV LLQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSEiaMLHSTS VPVFDVSPTS HMHSASLQGL 1260 

TISYASEKYE PVLLKSESSH QWPSIjYSND ELFQTANLEI NQAHPPKGRH VFATPVLSID 1320 

EPLMTIilNKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380

FBRZXSSVTST KIiIiFPSKATS EXaSBSAKSDA GLVGGGBDO} TIX3DGDDDDD PRQSpQLSIH 1440 

KCMSCSSYRE SQBKVKNDSO THENSLKDQN NPZSY8LSBK SBEDNRVTSV 8SDSQTGMDR 1500 

8PGKSPSAN6 LSQKHIIDGKB ENDIQTGSAL I.PL8PE8KAW AVLTSDEES6 SGQGTSOSLK 1560 

ENETSTDFSF ADTNEKDADG ILAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HESRI6LAE6 LBSEXKAVZP LVZVSALTFI CLWLVGIItl YWRKCFQTAE FYLEDSTSPR 1680 

VZ8TPPTPIF PZ8D0VaAIP ZRBFPXHVAD liRASSGFTEB FBTLKBFYQB VQSCTVDLGZ 1740 

TADSSNHPDN RHKNRYZHZV AYDHSRVKLA CLAEKDOKLT DYZKARYVZX3 YNRPRAYZAA 1800 

QQPLKSTAED FWRMIWEHNV EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFI*VTQKSVQ 1860 

VLAYYTVHNF TLRNTKZKKG SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY 1920 

AKRHAV6PW VBCSAGVQRT GTYIVLDSNL QQZQHEGTVN ZFGFLXHIRS QRNYLVQTEE 1980 

QyVFZRDTLV EAZLSKBTEV LDSHZHAYVN AIiLZFGPAOX TKLBKQFQIjL SQ8NZQQSDY 2040 

SAALXQCNRE KNRTSSZZFV ERSRV0Z86L SGEGTDYINA SYZMSYYQSIT EFZZTQHFLL 2100 

HTZKCFWRMI WDHHAQLVVM IPDGQNMABD BFVYWPNKDB PZNCESPKVT UftEEHKOiS 2160 

NEEKLIIQDF ILEATQDDYV LEVRHFQCPK WPNPDSPISK TFELISVIKE BAANRDGPMZ 2220 

VBDEHGGVTA GTFCALTTIiM KQLEKBtT8VD VYQVAKNZNL ^4RP6VFADZE QYQFLYKVZI« 2280 
SLVSTRQBEH PSTSU)S1IGA ALPD6NZABS LBSLV 

Seq ID NO: 574 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4518 

1 11 21 31 41 51

| | | | | |

CACACATACG CACGCACX3AT CTCACTTOSA TCTATACACT GGAGQATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTC3G CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

0QG08AG6Q0 OCGCAGACOG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CX3CTTGCATT 180 

CAGCTCCTCT GOTIT GC U Q CCTOaATTGQ GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTOTTGAAG AGATTGGCTG GTCCTATACA GQAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGQAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAOOOOAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGQA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGQACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAOA AGAAAATTTG 720 

GATTTCAAAG OGATTATTGA TGGAGTCX3AA AGTGTTAOTC GTTTTGGGAA GCAGGCTGCT 780

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TQACATCTCC TCCCTGCACA 6ACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTC AAOTICTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGQAAG A6ATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTQAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTGGAGTCGT TTATGATACC ATGATTGAGA AGTTTOCAGT TTT6TACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTQA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAOTTATQ TTCTTCAaAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC QACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTQAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGA6QQAA AAGACATTGA AQAAGGOGCT ATTGTGAATC CTGGTAGAGA CA6TGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTAOCACAA CACACTACAA TOQCATAGGG 1560 

AC3GAAATACA ATQAA6CCAA GACTAACOGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

6AAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACT6 TTCTTAQATC TCCACATATG 1800 

AACTTGTOGG GGACTGCAjSA ATCCTTAAAT ACAOTTTCTA TAACAQAATA TGftOGAGGAG 1860 

AGTTTATT6A CXAGTTTCAA 6CTTGATACT GGAOCTGAAG ATTCTTCAGG CTCCAG TCCC 1920 

GC3UVCTTCTG CTATCCCATT CATCTCTQAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACXAG AATCTGCTAG AAATGCTTCC 2040 

6AAGATTCAA CTTCATCA6G TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGQAAAT 2100 
GTaTGGTTTC CTAGCTCTAC AGACATAACA GCACA0CCOS ATGTTGQATC AGGCAOAGAG 2160
AGCTTTCTCC AGACTAATIA CACTQAGATA COTOTTQATO AATCTOAOAA 6ACAA0CAAG 2220 
TCCTTTTCTG CABGCCCAGT QATOTCACAG G6TCCCTCA0 TTACAOATCT GOAAATGCCA 2280 
CATTATTCTA CCTTT6CCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 
ICCAGACAAC AQGATTTGGT CTCCAC6GTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 
GTATACAATG CAGAG6CCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AG CTGAG GGG 2460 
TTGGAATCXX3 AGAAOAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520 
TGTCTAGTG6 TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580
TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 
CCAATTTCAG ATGATGTCX3G AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 
TTACATGCAA GTAGTG6GTT TACTGAAGAA TTTGAGACAC TQAAAGAQTT TTACCAGGAA 2760 
OIGCAGRGCT GTACTGTTGA CTTAGGTATT ACAGCftiSACA GCTCCAACCA CCCAGACAAC 2820 
AAGCACAAGA ATCGATACAT AAATATOGTT GCCTATOATC ATAGCAGGGT TAAGCTAGCA 2880 
CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 
TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAQGCCCAC TGAAATCCAC AGCTGAAGAT 3000 
TTCTGGAGAA TGATATGGGA ACATAATGTG GAAGTTATTG TCATGATAAC AAACCTCGTG 3060 
6AGAAA6GAA G0A£5AAAAT0 TGATCaGTAC TGGCCTGCCG ATGGGAGTGA GGAGTACGGG 3120 
AACTTTCTGG TCACTCAGAA OAOTOTGCAA GT0CTT6CCT ATTATACTGT GAGGAATTTT 3180 
ACTCTAAGAA ACACAAAAAT AAAAAAGOQC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240 
6TCACACAGT ATCACTACAC OCAGTGGCCT GACATGGGAG TACXA6A6TA CTCXICTGCCA 3300 
GTGCTGACCT TTGTOAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTGGG GCCTGTTGTC 3360 
QTCCACTGCA OTGCTGOAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420 
CA0C3WSATTC AACA06AAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCyV 3480 
CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TA CACTGG TT 3540 
GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATOC CTATGTTAAT 3600 
6CACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG AOAAACAATT OCAGCTCCTG 3660 
AGOCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720 
AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTGGCAT TTCATCCCTG 3780 
AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840 
GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGOATTTCTO GAflGAIGATA 3900 
TGGGACCATA ATGCCCAACT GGTGGTTATQ ATTCCTGATG OCCAAAACAT G6CA0AA0AT 3960 
QAATTTOTTT ACTOGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGT CACT 4020 
CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080
ATCTTAGAAG CTACACAGQA TGATTATGTA CTT6AAGT6A GGC31CTTTCA GT6TCCTAAA 4140 
TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAA6AA 4200 
GAAGCTGCCA ATAQQGATGG GCCTATGATT GTTCATGATG AGCAT6GAGG AGTGACGGCA 4260 
GGAACTTTCT OTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTQGAT 4320 
GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCA6 GAGTCTTTGC TGACATTGAQ 4380 
C3W3TATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCAC31AGGCA GGAAGAGAAT 4440 
CCATCCACCT CTCTGGACAO TAATGOTGCA GCATXGCCTO ATOOAAATAT A OCTGAGAfl C 4500 
TTAGAGTCTT TAGTTTAACA CA6AAAGGGG TGQGGGGACT CACATCTGAG CATTQTTTTC 4560 
CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620 
CCTGACAGTA ACTTTCATGA CATAGGATTC TGCCGCCAAA TTTATATCAT TAACAATGTG 4680 
TGCCTTTTTG CAAGACTTOT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740 
CAGTATTTCT AAOAATGQAA TTOTGGTATT TTTTTCTGTA TTGATTTTAA CAQAAAATTT 4800 
CAATTTATAO AGGTTAGGAA TTCCAAACTA CAOAAAATQT TTGTTTTTAG TQTCAAATTT 4860 
TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAQTAGCC 4920 
TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTGCAGT ATTCACCTAA 4980 
AGTAGAAATA ATCR3TTACT TATTGTAAAT ACTGCCCTAO T6TCTCCATQ GACCAAATTT 5040 
ATATTTATAA TTCTAGATTT TTATATTTTA CTACTGAGTC AAfiTTTTCTA GTTCTGTGTA 5100 
ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTOACATT 5160 
GTATTGTCTT ACCTAAGTCA TTAACTTTGT TTCSVGCATGT AATTTTAACT TTTGTGGAAA 5220 
ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATQAGAAT AACACCTTAC C3UUVCATTGT 5280 
TCAAATGGTT TTTATCCAAG QAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 575 Protein sequence:
Protein Accession #: Eos sequence 

1 11 21 31 41 51

MRILKRPIAC IQLLCVCRIiD WANGYYRQQR KLVBEIGWSY TGAWQKHWG KKYPTCMSPK 60 

QSPIMIDSDL TQVNVNLKKL KFQGWDKTSL BNTPIHNTGK TVBINLTNDY RVSGGVSBMV 120 

PKASKITFHW GKCNMSSDGS EHS1.BGQKFP LEMQ1YC3?DA DRFSSFBEAV KGKGKLRALS 180 

IIiFEVGTEBM LDFKAIID6V ESVSRFGKQA ALDPFILLNL LPNSTDIOfYI YNGSLTSPPC 240 

TOTVDWTVFK DTVSISESQL AVPCBVLTMQ QSGYVMLMDY LQHNFRBQQY KPSRQVFSSY 300 

TGKEBIHEAV CSSEPENVQA DPBMYTSLLV TWERPRWYD IMIEKFAVLY QOLDGBDQTK 360 

HBFI.TDGYQD LQAILNNLLP NMSYVLQIVA ICTHGLYGKY SDQliIVDMPT DNPELDLPPB 420 

LIGTBEIIKE EEEGKDIEBG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSBFS GKGDVPNTSIi NSTSQPVTKL ATEKDISLT8 QTVTBLPPHT VEGTSASIjND 540 

GSKTVLRSPH MNLSQTAESL NTVSITBYEB ESLLTSPKLD TQAEDSSGSS PATSAIPFIS 600 

EHISQGYIFS SEMPETITYD VI.IPESARMA SEDSTSSGSB ESLKDPSMBG NVWFPSSTDI 660 

TAQPDVGSGR ESPLQTNYTE IRVDESBKTT KSPSAGPVMS QGPSVTDLEM PHYSTEAYPP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYKAEASNS SHESRIGLAE GLESSKKAVI 780 

PLVrVSALTF ICLWLVGIL lYWRKCFQTA HPYIiEDSTSP RVISTPPTPI FPISDDVGAI 840 

PIKHFPKHVA DLHASSGFTE EPBTLKEFYQ EVQSCTVDI.G ITADSSNHPD NKHKNRYINI 900 

VAYDHSRVKL AQIABKDOKL TDYINANYVD GYNRPKAYIA AQQPLKSTAB DFWRMIWEHN 960 

VEVIVMITOIi VEKGRRKCDQ YWPAD6SEEY GNFLVTQKSV QVLAYYTVBN FTLRMTKIKK 1020 

GSQKGRPSGR WTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV WHCSAGVGR 1080 

TGTYIVLDSM LQQIQHEGTV NIPGFLKHIR SQRNYLVQTB EQYVPIHDTL VBAILSKBTE 1140 

VLDSHIHAYV NALLIPGPA6 KTKLEKQFQL l£QSKIQQSD YSAALKQCNR EKNRTSSIIP 1200 

VBRSRVGISS LSSEQTDYIN ASYIMGYYQS NEPIXTQHPL LHTIKDFWRM IWDHMAQIiW 1260 

MIPDGQHMAE DEPVYWESKD BPIKCBSFKV TLMABEHKCL SNEEKLIIQD FILEATQDDY 1320 

VLEVRHFQCP KWPNPDSPIS KTFEI.ISVIK EBAAHRDQPM IVBDEHGGVT AGTPCALTTL 1380 

MHQLEKENSV DVYQVAKMIN IMRPGVFADI BQYQPI.YKVI LSI.VSTRQBB NPSTSLDSMG 1440 
AALPDGNIAE SLESLV 
Seq ID NO: 576 DNA sequence

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4494

1 11 21 31 41 51

| | | | | |

CaCACATACG CAOGCACGAT CTCACTTCQA TCTATACACT GGftGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCJG CTCCCCCTCC CTCTCCACTC TG AGAAG CAG AGGAGCCGCA 120 

OGGCXaWSGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTQTTTGCCG CCTGQATTGG GCTAATGGAT ACTACAGACA AC3VGAGAAAA 240 

CTTOTTCSAAa AOATTGOCTG GTCCTATACA GGA6CACXGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAO CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGQATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TQACTACCGT 480

GTCAGCX3GAG GAGTTTCASA AATGGT6TTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATOCAATA TQTCATCTQA TGOATCAGAG CATAGTTTAG AAQGACAAAA ATTTCCACTT 600 

QAGATCC3WA TCTACTGCTT TQATGCACSAC CQATrTTCAA GTTTTGAOGA AGCAGTCAAA 660 

QGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTC 720 

QATTTCAAAG CQATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGOOAA GCAG GCTGC T 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTQACAAGTA TTACATTTAC 840 

AATGGCTCAT TQACATCTCC TCCCTGCACA QACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

AC3U3TTAOCA TCTCTQAAA6 CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGC3X3AT GQACTACTTA CAAAACAATT TTCGAGAGCA ACAG TACAAG 1020 

TTCTCTAGAC AQGTQTTTTC CTCATACACT GGAAAG6AAG AGATTCATGA AQCAGTTTOT 1080 

A6TTCAGAAC CAGAAAATGT TCAQGCTGAC CCAGAGAATT ATACCA6CCT TCTTG TTACA 1140 

TG6GAAAGAC CT06AGTCGT TTATGATACC ATGATTGAGA AQTTT6CAGT TTTGTACCAG 1200 

CaGTTCGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAflC GACCAACIGA TTOTOOACAT GCCTACTQAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAASAGGGAA AAGACATTQA AGAAGGCGCT ATT G T G AATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGQA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACXXSA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACC3U5TCAC TAAATTAGCC 1680 

ACA6AAAAAG ATATtTCCTT GACTTCTCAG ACIGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAQATC TCCACATATO 1800 

AACTTGTCGG GGACTQCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CC3iGTTTCAA GCTTGATACT GGA6CTOAAG ATTCTTCAGG CTCCAG TCCC 1920 

GCAACTTCTG CTATCOCATT CATCTCTOAO AACATATCCC AAOOGTATAT ATTTTCCTCC 1980 

GAAAACCCAO AGACAATAAC ATATQATGTC CTTATAOCAS AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAA6AA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AQACATAACA GCACAGCCOG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGA6ATA OGTGTTGATG AATCT6AGAA GACAACCAAG 2220 

TCCTTTTCTG CAOGCOCAOT GATGTCACAG GGTCCCTrCAO TTACAQATCT GGAAATGCCA 2280 

CATTATTCPA CCTTTGCCTA CTTCCCAACT QAOGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCA06GTC AAOGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

6TATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCXOSTA TTGGTCTAGC TGA6GGGTTG 2460 

GAATCC8AGA AQAAGQCAGT TATACCCCTT GTGATCGTGX CAGCCCTGAC TTTTAT CTGT 2520 

CTAGTCGTTC TTQTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCRGAC TGC ACAC TTT 2580 

TACTTA6AG0 ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGGAAGTGC AGAGCTQTAC TGTTQACTTA 2760 

GGTATTACAG CAGACA6CTC CAACCACCCA GACAACAAGC ACAAGAATCG ATACATAAAT 2820

ATG6TTGCCT ATGATCATAG CAGGGTTAAG CTAGGACAGC TTGCTGAAAA GGATX3GCAAA 2880 

CTGACTGATT ATATC3UITGC CAATTATGTT GATGOCTACA ACAQACCAAA AGCTTATATT 2940 

GCIGCCCMG GCCCACTGAA ATOCACAGCT GAAQATTTCT GQAQAATQAT ATGGQAACAT 3000 

AATGTGGAAG TTATTGTCAT GATAACAAAC CTOGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CAGTACTGGC CT60CGATGG GAGTGAGQA6 TAOGGGAACT TTCTGGTCAC TCAGAAGAGT 3120 

QTGCAAGTOC TTGCXSATTA TACTOTGAGG AATTTTACTC TAAQAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AGAAAGGAAG ACCCaGTGGA OGTGTGGTCA CACAGTATCA CTACACGCAG 3240 

TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC TGACCTTTGT GAG AAAGG CA 3300 

GCCTATGCCA AGOGCCATGC AGTGGGGCCT GTTGTCGTCC ACTGCAOTGC TQGAOTTGOA 3360 

AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC AGAITCAACA GGAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA QAAATTATTT GGTACAAACT 3480 

6AGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGQ CCATACTTAG TAAAGAAACT 3540 

GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCXAG CTCCTGAGCC AGTCAA ATAT ACAGCAGAGT 3660 

GACTATTCTO CAGCCCTAAA GCAATGCAAC AGGGAAAAOA ATCQAACTTC TTCTATCATC 3720 

CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCXrrCCr ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATAT6GG ACCATAATGC CCAACTGGTG 3900 

GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT TTGTTTACXG GCCAAATAAA 3960 

GATGAQCXTTA TAAATTGTGA OAGCTTTAAG* GTCACTCTTA TGGCTGAA6A ACACAAATGT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAGA TAGCCCCATT 4140 

AGTAAAACTT TTGAACTXAT AAGTGTTATA AAAGAAGAAG CTGC CAATAQ GGATGGGCXrr 4200 

ATGATTGTTC ATGATGAGCA TGGAGGAGTG AGGGCAOGAA CTTTCT8TQC TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT ACCA6GTAGC CAAGATGATC 4320 

AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT CCACCTCTCT GGACAGTAAT 4440 

G6TGCAGCAT TGCCT6ATGG AAATATAGCT GAQAGCTTAG AGTCTTTAGT TTAACACA6A 4500 

AAGG6GT6GG G6QACTCACA TCTGAGCATT OTTTTCCTCT TCCTAAAATT AGGGAGGAAA 4560 

ATCAGTCTAG TTCTQTTATC TGTTGATTTC CCATCACCTQ ACAgTA ACTT TCATGACATA 4620 

GGATTCTGCC QCCAAATTTA TATCATTAAC AATGTGTGCC TTTTTGCAAG ACTT6TAATT 4680 

TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAArPGT 4740 

GOTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT TTATAQAGGT TAGGAATTCC 4800 
AA&CTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860

AG8TTTGCTA GAAATATAAC TTTTAATACA GTAGCXTrGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAQTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980

6TAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AOATTTTTAT 5040 

ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAOTTTTCT GACATTGTAT TGTGTTACCT AAOTCATTAA 5160 

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTGAAAG 5220 

AA8TTTTTAT 6AGAATAACA CCTIAGCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280

TOCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 577 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MRILKHFIiAC ZQLLCVCRLD HAHGYYRQQR KLVEEZGNSY TOAUIQKNNG KKYFTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KPQ&tDKTSL ENTFXHNTGK TVBINLTNDY RVSG6VSEHV 120 

FKASKITFHW GKCNMSSDGS EBSIiEGQKFP LEMQIYCFDA ORFSSFEEAV KGK6KLRALS 180

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA AltDPFILLNL LPKSTDKYYI YNGSLTSFPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIUEAV CSSEPENVQA DPENYTSLLV TWERPRWYD TMIEKFAVLY QQLD6EDQTR 360 

HEFLTDGYQD LGAIIiNNLLP NMSYVLQIVA ICTNGLYGKY SDQLZVDMPT I»IFEZJ>LFPB 420 

IiIGTEEIIKE EEE6KDIEBG AIVNPGRDSA TNQIRKKBPQ ISTTTHYNRI GTKYNEAKTN 480

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTABSL HTVSITBYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESASNA SEDSTSSGSE ESLKDPSME6 NVWFPSSTDX 660 

TAQPDVGSGR ESPLQTNYTE IRVDESEKTT KSFSA6PVMS QGPSVTDLEM PHYSTFAYPP 720 

TEVTPHAFTP SSRQQDLV8T VNWYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVZVSALTFI CLWLVGIIiZ YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 

IXHFPKBVAD LHASSGFTBE FEEVQSCTVD IX3ITADSSNH PDNKHKNRYI NIVAYDHSRV 900 

KLAQLAEKDG KLTDYINANY VDGVNRPKAY IAAQ6PLKST AEDPVfRMIWE HNVEVIVMIT 960 

NLVEKGRRKC DQYWPADGSE EYGNFLVTQK SVQVLAYYTV RNFTLRNTKI KKGSQKGRPS 1020 

GRWTQYHYT QWPDMGVPEY SLPVLTFVRK AAYAKRHAVG PVWHCSAGV GRTGTYIVLD 1080 

SNIiQQIOHEG TVKIFGFLXH IRSQRNYLVQ TEEQYVFIHD ThVBAXhSgS TEVIiDSHIBA 1140 

YVMALLIPGP AGKTKIiBKQF QLLSQSNXQQ SDYSAALKQC IIREKNRTSSI IPVERSRVQI 1200 

SSLSGBGTDY INASYIMGYY QSNEFIITQH PUaTIKDFW RMIHDHMAQL WMIPDGQim 1260 

AEDEFVYWPN KDEPINCESF KVTLMAEEHK CLSNEEKLII QDFILEATQD DYVLEVRHPQ 1320 

CPKHPNPDSP ISKTFELISV IKEBAANRDO PMZVBDEKGG VTAGTFCALT TLi4HQLEKBH 1380 

SVDVYQVAKM INLMRPGVFA DIBQYQFIiyK VZLSLVSTRQ BENPSTSLDS NGAALPDONI 1440 
AESLESIiV 



Seq ID NO: 578 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 501-4514 

1 11 21 31 41 51 

| | | | | |

CftCACATAGS CAOGCAOSAT CTCACTTGGA TCTATACACT GGAG6ATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTG6 CTCXXXXTTCC CTCTCCACTC TGA6AA6CA6 A6GA6CGGCA 120 

CGGCGAGGGG CCGCAGACCXS TCTGGAAATG C3QAATCCTAA AGCGTTTCCT CX3CTTGCATT 180 

CAGCTCCTCT GTGTTTGCC6 CCTGGATTGG GCTAATXSGAT ACTACAGACA AC3W5AGAAAA 240 

CTTGTTGAAG AGATTGGCT6 GTCCTATACA GGAGCACXGA ATCAAAAAAT TGGGGAAAGA 300 

AATATCCAAC ATGTAATA6C CXMUkACAAT CTOCTAtCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAA6 AAACTTAAAT TTCAG6GTT6 GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACOGTG 480 

TCAGCXS3AGG AGTTTCAGAA ATGGTGTTTA AAGCAAGC3iA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600 

AGATGCAAAT CTACTGCTTT GATGCGGACC 6ATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAQAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTCGAAA GTGTTAGTCG TTTTGGGAAG CAGGCTGCTT 780 

TA6ATCCATT CATACTQTT6 AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACA6TTGA CTGQATTGTT TTTAAAGATA 900 

CA6TTAQCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGTGA AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTQATG GACTACTTAC AAAACAATTT TOGAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG 6AAA66AAGA OATTCATGAA GCAGTTTGTA 1080 

GTTCAQAACC AOAAAATGTT CAOGCTGACC CAGAQAATrA TACCAGCCTT CTTGTTACAT 1140 

G66AAA6ACC TCGAGTGGTT TATGATACCA TGATTGAGAA GTTTGCA6TT TTGTACCAGC 1200 

AGTTGGATGG AGAGQACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGOG ACCAACT6AT TGTCGACAT6 CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGQAACTQA AGAAATAATC AAGGAGGAGG 1440 

AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC A6TGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACCAGTCACT AAATTAGCCA 1680 

CAGAAAAAGA TATTTCCTTO ACTTCTCAGA CIQTGACTGA ACTGGCACCT CACACIGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGQCT CCAAAACTGT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACTCCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 

GTTTATTQAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCAGTCCOS 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCG 1980 

AAAACCCAGA GACAATAACA lATGATGTOC TTATACCAGA ATCTGCTAGA AATGCTTCOG 2040 

AAGATTCAAC TTCATCAG6T TCAGAAGAAT CACTAAAG6A TCCTTCTATG GAGG6AAATG 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGAGA 2160 

GCTTTCTCCA GACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC AGGCCCAGTG ATGTCACAGG GTCC3CTCRGT TACAGATCTG GAAATGCCAC 2280 



ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC 
CChGACAACA GGATTTG6TC TCCACGGTCft AOOTGOZAXA 
TATACAATGA GGCCA6TAAT A0TA6CCATG AGTCTOBTAT 
AATCCGA6AA GAAGGCA6TT ATACCXXTTTG TQATCGTGTC 
TAGTGGTTCT TGTGGGTATT CTCATCTACT GGA(3GAAATG 
ACTTA6AGGA CAGTACATCC CCTA6AGTTA TATCCACACC 
TTTCAGATGA TGTGGGAGCA ATTCCAATAA AGCftCTTTCC 
ATGCAAOTAO TGQGTTTACT GAAGAATTTG A6ACACTGAA 
AGAGCTOTAC TGTTGACTTA GGTATTACAG CAGACAGCTC 
ACAAGAATCG ATACATAAAT ATOGTTGCCT ATGATCATAG 
TTGCTGAAAA 6GATGGCAAA CTGACTGATT ATATCAATGC 
ACAGACCAAA AGCTTATATT GCTQCOCMG GCCCACTGAA 
GGAGAATGAT AT6GGAACAT AATGTGGAA6 TTATTGTCAT 
AA6GAAGGAG AAAATGTGAT CAGTACTGGC CTGCCGATGG 
TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA 
TAA6AAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAA6 
CACAGTATCA CTACAC6CA6 TGGCCTGACA TGOGAGTACC 
TOACCTTTGr GAOAAAGGCA CCCTATGCCA AGCGCCATGC 
ACTGCAGTGC TGGAGTTGGA AGAACAGGCA CATATATTGT 
AGATTCAACA CGAAGGAACT GTCAACATAT TTGGCTTCTT 
6AAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT 
CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT 
TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA 
AQTCAAATAT ACAGCAGAGT GACTATTCTG CAGCOCTAAA 
ATCGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGOaT 
GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATG60 
TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA 
ACCATAATGC CCAACTGGTG GTTATQATTC CTGATGGCCA 
TTGTTTACTG GCCAAATAAA GATGAGCCTA TAAATTGTGA 
TG6CTGAAGA AGACAAAT6T CTATCTAATG AOGAAAAACT 
TA8AA6CTAC ACAGGATGAT TATGTACTT6 AA6T6AG6CA 
CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT 
CTGCCAATAG OGATGGQCCT ATGATTGTTC ATGATGAGCA 
CTTTCTGTGC TCTGACAAGC CTTATGCACC AACTA6AAAA 
ACCAGGTAGC CAAGKTGATC AATCTGATGA GG0CA6GAGT 
ATCAGTTTCT CTACAAAGTG ATCCTCABCC TTGTGAGCAC 
CCACCTCTCT GGAOUSTAAT GGTGCAGCAT TGCCTGATGG 
AGTCTTTAGT TTAACACAGA AAGGGGTGGG GGGACTCACA 
TCCTAAAATT AGGCA6GAAA ATCAGTCTAG TTCIGTTATC 
ACAGTAAC7T TCATQACATA OGATTCTGCC GCCAAATTTA 
TTTTTGCAA6 ACTTGTAATT TACTTATTAT GTTTGAACTA 
ATTTCTAAGA ATGGAATT6T GGTATTTTTT TCTGTATTGA 
TTATAGAGQT TAGQAATTCC AAACTACAGA AAATGTTTGT 
CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC 
AATAAAACAC TCTTCXATAT GATATTGAAC ATTTTACAAC 
GAAATAATCT OTTACTTATT GTAAATACTG CCCTAGTGTC 
TTATAATTGT A6ATTTTTAT ATTTTACTAC TGAGTC3VAGT 
TTTAGTTTAA TGAaSTAGTT CATTAGCTGG TCTTACTCTA 
TGTOTTAOCT AAGTCATTAA CTTTGTTTCA GCATGTAATT 
AAATACCITC ATTTTGAAAG AAGTTT7TAT GAGAATAACA 
ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA 
AAAAAAAAAA AAAAAAAAAA AAA 



PCT/US02/12476 



TCATGCTTTT 
CTOQ CAGACA 
TGGTCTAGCT 
AGCCCTGACT 
CTTCCAGACT 
TCCAACACCT 
AAAGCATGTT 
AGAGTTTTAC 
CAACCACCCA 
CAGGGTTAAO 
CAATTATGTT 
ATCCACAGCT 
GATAACAAAC 
GAGTGAGGAG 
TACTGTGAGG 
ACCCA(7rGGA 
AGAGTACTCC 
AGTGGGGCCr 
GCTAGACAGT 
AAAACACATC 
TCATGATACA 
TCATGCCTAT 
ACAATTCCAG 
GCAAT6CAAC 
TQGCATTTCA 
CTATTACCAG 
TTTCTGGAGG 
AAACATGGCA 
GAGCTTTAAG 
TATAATTCAG 
CTTTCAGTGT 
AAGTGTTATA 
TGGAGGAGTG 
AGAAAATTCC 
CTTTOCTGAC 
AAG6CAGGAA 
AAATATA6CT 
TCTQAGCATT 
TGTTGATTTC 
TATCATTAAC 
AAATOATTGA 
TTTTAACAGA 
TTTTAGTGTC 
TTTTAATACA 
TGCAGTATTC 
TCCATGGACC 
TTTCTAGTTC 
CCA6TTTTCT 
TTAACTTTTG 
CCTTAOCAAA 
TtGOCATTAA 



ACCCCATCCT 
ACCCAACOGG 
GAGG6GTTG6 
TTTATCTGTC 
GCACACTTTT 
ATCTTTCCAA 
GCAGATTTAC 
CAGGAAGTGC 
GACAACAAQC 
CTAQCACAGC 
GATGGCTACA 
GAAGATTTCT 
CTCGTGGAGA 
TACX3GGAACT 
AATTTTACTC 
OGTGTGGTCA 
CTGCCAQTGC 
GTTGTCGTCC 
AT6TTGCAGC 
CGTTCACAAA 
CTGGTTQAGG 
GTTAATGCAC 
CTCCTQAGCC 
AGGGAAAAGA 
TCCCTGAGT6 
A6CAATGAAT 
ATGATATGGG 
GAAGATGAAT 
GTCACTCTTA 
GACTTTATCT 
CCTAAAT6GC 

aaagaagaag 
acx;gca6gaa 

GTGGATQTTT 
ATTGAGCAGT 
GA6AATCCAT 
GAGAGCTTAG 
GTTTTCCTCT 
CCATGACCTG 
AATGTGTGCC 
A'i'lTTACAGT 
AAATTTCAAT 
AAATTTTTAG 
GTAGCCTGTA 
ACXTTAAAOTA 
AAATTTATAT 
TGTGTAATTG 
GACATTGTAT 
TOGAAAATAG 
CATT6TTCAA 
AAAAAAAAAA 



Seq ID NO: 579 Protein sequence: 
Protein Accession #: EOS sequence 



MVFKASKITF 
LSIIiFEVGTE 
PCTDTVDWZV 
SyTGKEBIHB 
TtCHBFLTDGY 
PELIGTBEII 
TmSPTRGSE 
ND6SKTVLRS 
ISENISQGYI 
DITAQPDVGS 
PPTEVTPHAP 
IPI.VIVSALT 
IPIKHPPKHV 
IVAYDHSRVK 
NVEVIVMITN 
KGSOKGRPSG 
RTGTYIVLDS 
BVIiDSHZBAY 
PVERSRVGIS 
VMIPDGfflJMA 
YVLBVRHFQC 
LNRQLBXBKS 
OAALFDGNIA 



11 

I 

WKSKOmSSD 
ENLDFXAIID 
FKDTVSZSES 
AVCSSBPENV 
QDLGAIIiNNL 
KEEEEX3KDIE 
FSGKGDVPOT 
PHMNLSGTAE 
PSSEi^ETZT 
GRESFLQTNY 
TPSSRQQDLV 
FICLWLVOI 
ADLHASSGFT 
LAQLAEKDGK 
LVBKGRRKCD 
RWrOYHYTQ 
MLQQIQHEGT 
VKALLIFGPA 
SLSGBGTDYI 
EDSFVYWPNIC 
PKNPHPOSPZ 
VDVYQVAXMX 
ESLESIiV 



21 

I 

GSEHSLEGQK 
GVESVSRFGK 
QLAVPCEVLT 
QADPENYTSL 
liPNMSYVLQI 
EGAZVNPGRD 
SUJSTSQPVT 
SLNTVSITEY 
YDVLIFESAR 
TEIRVDESER 
STVNWVSQT 
LZYWRKCFQT 
EBPETIiKBFy 
LTOYZHANYV 
QYnPADGSBB 
WPDMGVPEYS 
VMIFGPLKHI 
GKTKLEKQFQ 
NASYIKGYYO 
DEPINCBSFK 
SKTFBLISVX 
NU1RPGVFAD 



31 

I 

FPLEMQIYCF 
QAALDPPZLL 
MQQSQYVMIJ4 
LVTWERPRW 
VAICTNGLYG 
SAINQIRKKE 
KLATEKDISL 
EEESIiLTSFK 
KASEDSTSSG 
TTKSFSAGFV 
TQPVYNBASN 
AHFYLEDSTS 
QEVQSCTVDL 
DGYNRPKAYl 
YGNFLVTQKS 
LPVLTFVRKA 
RSQRNYLVQT 
LLSQSNZQQS 
8NEFZZTQHP 
VTU1ABEBKC 
KEEAANRD6P 
lEQYQFLYKV 



41 

I 

OADRFSSFEE 
NLLPKSTDKY 
DYLQNNFREQ 
YDTOZEKPAV 
KYS DOLiyP M 
PQZSTTTHYN 
TSOTVTBLPP 
LDTGAEDS8G 
SBESLKDPSM 
MSQGPSVTDL 
SSHESRZ6LA 
PRVISTPPTP 
GITADSSNHP 
AAQGPIjKSTA 
VQVLAYYTVR 
AYAXRHAVGP 
EEQYVFZHDT 
DYSAALKQQ} 
LLHTIKDFWR 
LSNBEKLZIQ 
MIVEDERGGV 
ZLSIiVSTRQB 



51 

I 

AVKGXGKZiRA 
YZYNGSLTSP 
QYKFSRQVPS 
LYQQLDGEDQ 
PTDNPEJjDLF 
RZGTKYNEAK 
HTVEGTSASL 
SSPATSAIPP 
EGNVWPPSST 
EMPmrSTFAY 
E6LBSEKKAV 
ZFPISDDVGA 
DNKHKNRYIH 
EDFWRMZHEH 
HPTIiRNTRZK 
VWHCSAGVG 
LVBAZLSKET 
REKNRTSSZZ 
MZnCHNAQLV 
DFZLBATQDD 
TAGTFCALTT 
ENPSTSLDSH 



2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2620 
26d0 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3760 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



Seq ID NO: 580 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4632 



11 



21 



31 



51 



I I I 

CACIICATAOG CACGCAOSftT CTCACTTG6A 
CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC 
CQGCGAGGG6 CCGCA0ACCX3 TCTGGAAATG 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG 
CTTGTTGAA6 A6ATTGGCTG GTCCTATACA 
AAATATCCAA CATQTAATAG CCCAAAACAA 
CAAGTAAATG TGAATCTTAA GAAACTTAAA 
AACACATTCA TTCATAACAC TGC3GAAAACA 
GTCAGOGGAG GAGTTTCAGA AATGGTGTTT 
AAATGCAATA TGTCATCTGA TGGATCAGAG 
GAGATGCAAA TCTACT6CTT TQATGCGGAC 
6QAAAA6GGA AQTTAAGA6C TTTATCCATT 
GATTTCAAAG CGATTATTGA TGGAGTCOAA 
TTAGATCCAT TCATACTGTT GAACCTTCTG 
AATGGCTCAT TGACATCTCC TCCCTGCACA 
ACAGTTAGCA TCTCTGAAAO OCAGTTGGCT 
TCTGGTTATQ TCATGCTGAT GGACTACTTA 
TTCTCTAGAC AGGTGTTTTC CTCATACACT 
AGTTCAGAAC CAGAAAA7GT TCAGGCT6AC 
TGGOAAAGAC CTCGAGTOGT TTATGATACC 
CAGTTGGATG GAGAGGACCA AACCAAGCAT 
GGTGCTATTC TCWVTAATTT GCTACCCAAT 
TGCACTAATG GCTTATATGG AAAATACAGC 
AATCCT6AAC TTGATCTrrT CCCTGAATTA 
GAAGA6GGAA AAGACATTGA AGAAGGCGCT 
AACCAAATCA GQAAAAAGGA ACCCCAQATT 
ACGAAATACA ATGAAGCCAA GACTAACCGA 
AAGGGTGATG TTCCCAATAC ATCTTTAAAT 
ACAGAAAAAG ATATTTCCTT GACTTCTCAG 
GAAGGTACTT CAGCCTCTTT AAAT6ATGGC 
AACTTGTCGG GGACTGCAGA ATCCTTAAAT 
AGTTTATTQA CCAGTTTCAA GCTTGATACT 
QCAACTTCTG CTATCCCATT CATCTCTGA6 
GAAAACCGAG AGACAATAAC AtTATGATGTC 
GAAGATTCAA CTTCATCAGG TTCAGAAGAA 
GTGTGGTTTC CTAGCTCTAC AGACATAACA 
AGCTTTCTCC AGACTAATTA CACTGAGATA 
TCCTTTTCTG CAGGCCCAGT GATGTCACAG 
CATTATTCXA CCTTTGCCTA C7TCCCAACT 
TCCA6ACAAC AGGATTIGGT CTCCAGGGTC 
6TATACAATG AGGCCAGTAA TAGTAGCCAT 
GAATCCGAGA AGAAGQCAGT TATACCCCTT 
CTAGTGGTTC TTGT6GGTAT TCTCATCTAC 
TACTTAQAGG ACAGTACATC COCTAQAOTT 
ATTTCAGATO ATST0GGA6C AA7TCCAATA 
CATGCAAGTA GTOGGTTTAC TGAAGAATTT 
CAGAGCTGTA CTGTTGACTT AGGTATTACA 
CACAAGAATC GATACATAAA TATCXTTTGOC 
CTTGCIGAAA AGGAT6GCAA ACTGACTGAT 
AACAGAGCAA AA6CTTATAT TGCTGCXXAA 
TGGAGAATGA TATGGGAACA TAATGTGQAA 
AAAGGAAGGA GAAAATGTGA TCAGTACTGG 
TTTCTGGTCA CTCAGAAGAG TGTGCAAGIX3 
CTAAGAAACA CAAAAATAAA AAAGG6CTCC 
ACACAGTATC ACTACACGCA GTGGCCTGAC 
CTGACCTTTG TGAGAAAGGC AGCCTATGCC 
CACTGCAQTG CTGGAGTTGG AAGAACAGGC 
CAGATTCAAC ACGAAGGAAC TGTCAACATA 
AGAAATTATT TGGTACAAAC TGA6GAGCAA 
GCCATACTTA GTAAA6AAAC TGA6GTGCTG 
CTCCTCATTC CTGGACCaGC AGGCAAAACA 
CTGTCACCCA GGCTGGAGT6 CAGAGGCACA 
GGCTTAACTG ATCCTCCTAC CTCAGCCTCC 
TCAAATATAC AGCAQAGrGA CTATTCTGCA 
CGAACTTCTT CTATCATCCC TGTGGAAAGA 
QAAGGCACA6 ACTACATCAA TGCCTCCTAT 
ATCATTAC70C AOCACOCTCT CCTTCATAOC 
CATAATGCCC AACTGGTGGT TATGATTCCT 
GTTTACTGGC CAAATAAAGA TQAGCCTATA 
GCTGAAGAAC AGAAATXSTCT ATCTAATGAG 
GAAGCTACAC AGGATGATTA TGTACTTGAA 
AATCCAGATA 6CCCCATTAG TAAAACTTT7 
GCCAATAGGG ATGGGCCTAT GATTGTTCAT 
TTCTGTGCTC TGACAACCCT TATGCACCAA 
CAGGTAGCCA AGATQATCAA TCTGATGAQG 
CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT 
ACCrCTCTGG ACAGTAATGG TQCSkGOlTTO 
TCTTTAOTTT AACACAGAAA GGGGTGGGGG 
CTAAAATTAG GCAGGAAAAT CAGTCTAGTT 
AGTAACTTTC ATGACATAGG ATTCTGCCGC 
TTT6CAAGAC TTGTAATTTA CTTATTATGT 
TTCKAAGAAT GGAATTGTGG TATTTTTTTC 
ATAGAGGTTA GGAATTCCAA ACTACAGAAA 
GTATTTGTAG CAATTATCA6 GTTTGCTAGA 
TAAAACACTC TTCCATATGA TATTCAACAT 
AATAATCTGT TACTTATTOT AAATACTGCC 



TCTATACACT GGAGGATTAA AACAAACAAA 60 

CTCTCCACTC TGAGAAGCAG AGGAGCG6CA 120 

CGAATCCTAA AAOGTTTCCT CGCTTGCATT 180 

GCTAATGGAT ACTACA6ACA ACAGAGAAAA 240 

GGAGCACTGA ATCAAAAAAA TTGGGQAAAG 300 

TCrCCTATCA ATATTGATGA AGATCTTACA 360 

TTTCAGGGTT GG6ATAAAAC ATCATT6GAA 420 

GTGGAAATTA ATCTCACTAA TGACTACCX3T 480 

AAA6CAAGCA AGATAACTTT TCACTGGGGA 540 

CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

CSOATTTTCAA GTTTTGAGGA AOCAGTCAAA 660 

TTGTTTGAG6 TTGGGACAGA AGAAAATTTG 720 

AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

CAAAACAATT TTC3GAGAGCA ACAGTACAAG 1020 

GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

ATGATTGAGA AGTTTGCAOT TTTGTACCAG 1200 

GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

ATGAGTTATG TTCTTCA6AT AGTAGCCATA 1320 

GACCAACTGA TTGTCX3ACAT GCCTACTGAT 1380 

ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

TCTACCACAA CACAGTACAA TCGCATAGGG 1560 

TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

TCCACTTCCC AACXAGTCAC TAAATTAGCC 1680 

ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

ACAGTTTCTA TAACAGAATA TGAGGAGGA6 1860 

GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

CTTATAOCAO AATCTOCTAG AAATQCTTCC 2040 

TCACTAAAGO ATCCT T CTAT GGAGGGAAAT 2100 

GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

GAG8TAACAC CTCATGCTTT TAGCGCATCC 2340 

AAG6TGGTAT ACTOGCAGAC AACCCAACCG 2400 

GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520 

TG6AGGAAAT GCTTCCAGAC TGCACACTTT 2580 

ATATCCACAC CTCCAACACC TATCTtTCCA 2640 

AAQCACTTTC CAAAGCATGT TGCAQATTTA 2700 

GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

GCAGACA6CT CCAACCACCX: AGACAACAAG 2820 

TATGATCATA GCAGGGTTAA GCTAGCACAG 2880 

TATATCAATG CCAATIATGT TGAIQGCTAC 2940 

G6CCCACTGA AATCCACAGC T6AAGATTTC 3000 

GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060 

CCTGCCGATG GGAGTGAGGA 6TACG6GAAC 3120 

CTTGCCTArr ATACTGIXSAG GAATTTTACT 3180 

CAGAAAGQAA GACCCAGTGO AG6TGTG8TC 3240 

ATGGGAGTAC CAGAGTACTC CXZTGCCAGTG 3300 

AAGOGCCATG CAGTGGGGCC TGTTGTCGTC 3360 

ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

TATGTCTTCA TTCATGATAC ACTGGTTQAG 3540 

GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

AAGCTA6AGA AACAATTCCA GGGTCTCACT 3660 

ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720 

CX3AGTG6CTG GGACTATACT CCTGAGCCAG 3780 

GCCXnrAAAGC AATGCAACAG GGAAAAGAAT 3840 

TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

ATCATGG6CT ATTACCAGAG CAATQAATTC 3960 

ATGAA8GATT TCTQSAGGAT GATATGGSAC 4020 

GATGGCCAAA ACATGGCAGA AGATGAATTT 4080 

AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140 

GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GTGAGGCACT TTCAGTGTGC TAAATGGCCA 4260 

GAACTTATAA OTGTTATAAA AGAAGAAGCT '4320 

GATGAGCATG GAG6A6TGAC GGCAGGAACT 4380 

CTAGAAAAAG AAAATTCOGT GGATGTTTAC 4440 

CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

GTGGGCACAA GGCAGGAAGA GAATCCATCC 4560 

CCTQATGGAA ATATAGCIOA GAOCrrASAO 4620 

GACrCACATC TGAGCa.TTGT ITTCL TO T C 4680 

CT G TT A TCTG TTGATTTCCC ATCACCTGAC 4740 

CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTGAACTAAA ATQATTGAAT TTTACAGTAT 4860 

TGIATTGATT TTAACAGAAA ATTTCAATTT 4920 

ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

CTAGTGTCTC CATXSGACCAA ATTTATATTT 5160 
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 
TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 
TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 
ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 
GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 581 Protein sequence: 
Protein Accession #: EOS sequence 






1 
I 

MRZLKRFLAC 
QSPINIDEDIj 
FKASKXTFHW 
ILFEVGTEEN 
TDTVDWIVFK 
TOKEEZHEAV 
HEFLTDGYQD 
LIGTEEIIKE 
RSPTRGSEF8 
GSKTVLRSFH 
ENISQ6YIFS 
TAQPDVGSGR 
TEVTPHAFTP 
LVIVSALTPI 
IKHFPKHVAO 
AYDHSRVKLA 
BVIVMITNLV 
SQKGRPSGRV 
GTYIVU3SML 
LDSHXEAYVH 
8RVA0TILL8 
YIMGYYQSNE 
INCESFKVTL 
FELISVIKEB 
RFGVFADIBQ 



11 
I 

IQLLCVCSIU) 
TQVNVNLKKL 
OKOmSSDGS 
LDFXAIIDGV 
DTVSZSESQL 
CSSEFENVQA 
LGAILNNLLP 
EEEGKDIEEG 
GKGDVPNTSIi 
NNLSGTABSL 
SEHPETITYD 
BSFLQTNYTE 
SSRQQDIiVST 
OiWIiVGILZ 
LRASSOFTEE 
QLAEKDGKLT 
EKSRRKCDQY 
VTQYHYTQWP 
QQIQHEGTVN 
ALLZPGPAGK 
QSNIQQSDYS 
PIITQHPLLH 
MAEEHKCXiSN 
AANSDGPMZV 
YQFLYKVILS 



21 
I 

HANGYYRQQR 
KFQ6HDKTSL 
EHSLEX3QKFP 
ESVSRFGKQA 
AVPCEVLTMQ 
DFBNYTSLLV 
KMSYVIiQIVA 
AIVNPGRDSA 
N8TSQPVTKL 
JUTVSITEYEB 
VLIPESARNA 
IRVDBSEKTT 
VNWYSQTTQ 
YWRKCPQTAH 
FETLKBPyQB 
DYINANYVDG 
WPADGSEEYG 
DMGVPEYSLP 
ZFGFIiKHIRS 
TKLEXQFQGL 
AAZiKQGNSEK 
TIKDFWRMIW 
EEKLZIQDFI 
HDEHGGVTAG 
LVGTRQEEKP 



31 
I 

KLVEEZ6HSY 
ENTFZHKTGK 
LEMQZYCPDA 
ALDPFZLUIL 
QSGYVMLMDY 
TWERPRWYD 
ICTNGLYQKY 
TKQIRKKEPQ 
ATBKDZSLTS 
BSLLTSFRLO 



41 
I 

TOALNQKNWG 
TVEZNLTNDY 
DRFSSFEEAV 
LPNSTDKYYZ 
LQNMFREQQY 
TMZEKFAVLY 
SDQLZVDMPT 
ZSTTTHYNRI 
QTVTELPPHT 



KSFSAGPVMS 
PVYNEASNSS 
PYLEDSTSPR 
VQSCTVIMU3Z 
YNRPKAYZAA 
NFLVTQKSVQ 
VLTFVRKAAY 
QBNYLVQTEE 
TLSPRIiECRG 
NRTSSZIPVE 
DHNAQLWMX 
LEATQDDYVI* 
TFCALTTLMH 
STSU3SNQAA 



QGPSVTDIiEM 
RESRIGLAEG 
VISTPPTPIF 
TADSSNHPDN 
QGPIiKSTAED 
VLAYYTVRNP 
AKRHAVGPW 
QYVPIHDTLV 
TZSAHCNLPL 
RSRVGZSSLS 
PDGQNMAEDE 
BVRHPQCPKM 
QLEKENSVDV 
LFDGNZAESL 



51 
I 

KKYPTOISPK 

RV8GGV5EMV 
KGKGKLRALS 
YKGSLTSPPC 
KFSROVFSSY 
QQLD6BDQTK 
DNFEIiDLPPE 
GTKmEAKTN 
VBGTSASUn) 
PATSAZPFIS 
NVHFPSSTDI 
PHYSTFAYFP 
LESEKKAVIP 
PISDDVQAZP 
KBKNRYZNZV 
FWRMZWEHNV 
TLRNTKIKXO 
VHCSAGVGRT 
EAZLSKETHV 
PGLTDPPTSA 
GBGTDYZNAS 
FVYWPNKDBP 
PNPDSPZSKT 
YQVAKNZNLM 
ESLV 



5220 
5280 
5340 
5400 
5460 



60 

120 
180 
240 
300 
360 
420 
460 
540 
600 
660 
720 
780 
840 

, 900 
960 

1020 

loao 

1140 
1200 
1260 
1320 
1380 
1440 






Seq ID NO: 582 DNA sequence 

Nucleic Acid Accession #: NM_002851.1 

Coding sequence: 148..7092 



1 

I 

CACACATACG 
CAAAAAAAAC 
0GG06AGGGG 
CA6CTCCTCT 
CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATG 
AACACATTCA 
GTCAGCGGAG 
AAATGCAATA 
GAGATGCAAA 
GGAAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTGGTTATG 
TTCTCTAGAC 
AGTTCAGAAC 
TGOOAAAOAC 
CAGTTGQATG 
GGTGCTATTC 
TGCACTAATG 
AATCCTGAAC 
GAAGA6G6AA 
AACCAAATCA 
ACGAAATACA 
AAGGGTGATG 
ACA6AAAAAG 
GAAGGTACTT 
AACTTGTCGG 
AGTTTATTGA 
GCAACTTCTG 
GAAAAGCCAG 
GAA6ATTCAA 
G TGTGGTTTC 
AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TCCAGACAAC 
GTATACAATG 
ACCCCTTTGT 



11 

1 

CACGCACGAT 
ATTTCCTTOG 
CCGCAGACOG 
GTGTTTGCCG 
AGATTGGCTG 
CATGTAATAO 
TGAATCTTAA 
TTCATAACAC 
GAGTTTCAGA 
T8TGATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
CGATTATTGA 
TCATACTGTT 
T6ACATCT0C 
TCTCT6AAA6 
TCATGCTQAT 
AGGTGTTTTC 
GAGAAAAT6T 
CTCQAGTC8T 
GAGAQOACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
G6AAAAA6GA 
ATGAAGCCAA 
TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
G6ACTGCAGA 
CCAGTTTCAA 
CTATCCCATT 
AGACAATAAC 
CTTCATCAGG 
CTAGCTCTAC 
AGACTAATTA 
CAGGCCCAGT 
CCTTTGCCTA 
AGGATTTGGT 
GTGAGACACC 
T6CTTGACAA 



21 

I 

CTCACTTCGA 
CTCCCCCTCC 
TCTGQAAATG 
CCTGGATTGG 
CTGCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TGGGAAAACA 
AATGGTGTTT 
TCGATCA6AG 
T6AT6CX3GAC 
TTTATCCATT 
TGOA6TCXSAA 
GAACCTTCTG 
TCCCraCACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGOCTGAC 
TTATQATACC 
AACCAA6CAT 
GCTACXXAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAG60GCT 
ACCCCA6ATT 
GACTAACG6A 
ATCTTTAAAT 
GACTTCTCAG 
AAATGATGGC 
ATOCTTAAAT 
6CTT6ATACT 
CATCTCTGAG 
ATATGATGTC 
TTCAGAAGAA 
AGACATAACA 
CACTGAGATA 
GATGTCACAG 
CTTCCCAACT 
CTCCACGGTC 
TCTTCAACCT 
TCAGATCCTC 



31 
I 

TCTATACACT 
CTCTCCACTC 
OGAATCCTAA 
GCTAATGGAT 
6GAGCACTGA 
TCrCCTATCSl 
TTTCAGGGTT 
GTGGAAATTA 
AAAGCAA6CA 
CATAOTTTAG 
CGATTTTCAA 
TTGTTTGA6Q 
AGTGTTAGTC 
CCAAACrCAA 
QACACAGTTG 
GTTTTTTGTG 
CAAAACAATT 
GGAAAGGAAG 
CCAGA6AATT 
ATOATTGAGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 
ATTGT6AATC 
TCTAGCACAA 
TCCCXaACAA 
TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 
ACAGTTTCTA 
GGAGCTGAAG 
AACATATCCC 
CTTATACCAG 
TCACTAAAGG 
GCACAGCCCG 
CGTGTTGATG 
G6TCCCTCAG 
GAGGTAACAC 
AAOGTGGTAT 
TCCTACAGTA 
AACACTACCC 



41 
I 

GGAGGATTAA 
TGAGAAGCAG 
AGOGTTTCCT 
ACTACAGACA 
ATOVAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 
GTmGAGGA 
TTGGQACAGA 
GTTTTGGGAA 
CTGACAA6TA 
ACT66ATT0T 
AAGTTCtTAC 
TTCGAGAGCA 
AGA7TCATGA 
ATACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 
TTCTTCAGAT 
TTGTOGACAT 
AAGAAATAAT 
CTOGTAGAGA 
GACACTACAA 
GAGGAAGTGA 
AACCAGTCAC 
AACTGCCACC 
TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTGCTAO 
ATCCTTCTAT 
ATGTTGGATC 
AATCTGAGAA 
TTACA6ATCT 
CTCATGCTTT 
ACTCGCAGAC 
GT6AAGTCTT 
CTGCTGCTTC 



51 
I 

AACAAACAAA 

AGGAGCCGCA 
CSCTTGCArr 
ACAGAGAAAA 
TTG66GAAA6 
AQATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 
ACAGTACAAG 
A6CAGTTTGT 
TCTTGTTACA 
TTTGTACCAO 
TCAAGACTTO 
AGTAGCCATA 
GCCTACTGAT 
CAAGGAGGAG 
CAGTGCTACA 
TCGCATA6GG 
ATTCTCTGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TQAGGAGGAG 

cTCcaGTcxx: 

ATTTTCCTCC 
AAATGCTTCC 
G6AGGGAAAT 
AGGCAGAGAG 
GACAACCAAG 
GGAAATGCCA 

TACCoavrcc 

AACCCAACCG 
TCCTCTAGTC 
AAGTAGTGAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
16B0 
1740 
1800 
1860 
1920 
1960 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
TC3GGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580 
TCTTCCTATG ATGGTGCACC TTTGCTTCC3V TTTTCCTCTG CTTOCTTCAG TAGXGAATT6 2640 
TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACC3GAGAGT 2700 
GATAA68TCC CCTTOCATOC TTCTCTGCCA QTQGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 
AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 
TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880 
AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 
QATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 
QATTCTGTGO OTGTAACTTA TC3W3QGTTCC TTATTTAGCG GCCCTAGCCA TATACX^ATA 3060 
CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 
GGTQATGGGG AATGGTCTGG AGCCTCTTCT 6ATAGTGAAT TTCTT TTAC C TGACACA6AT 3180 
GQGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTOTAG CTOAATTTAC ATATAC3ACA 3240 
TCTGTGTTTG GTGATGATAA TAAGGCX5CTT TCTAAAA6TG AAATAATATA TGQAAATGAG 3300 
ACTGAACTGC AAATTCCTTC TTTC7UVTGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 
CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 
ATTTCTAGCA CCAAGGGCAT 6TTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 
GATCATGAGA TTAGTCAAGT TCCaGAAAAT AACTTTTCAQ TTCAAOCTAC ACATACTGTC 3540 
TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGC CAGCA 3600 
TCCTCTGACC CTGCTTCTAG TQAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 
ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 
GACACCTTGC TTAAAACTGT TCTTCCAOCT GTOCCCAGTG ATCCAATATT GGTTGAAACC 3780 
CCCAAAGrro ATAAAATTAO TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTOCTTCA 3840 
AOTGAAAACA TGCTQCACTC TACATCTGTA CX»GTTTTTG ATOTGTCQCC TACTTCTCAT 3900 
ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATQAACCA 3960 
GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 
TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 
TTTCCTACAC CrGTTTTATC AATTGATOAA OCATTAAATA CACTAATAAA TAAGCTTATA 4140 
CftTTCOGATG AAATTTTAAC CTCCACOVAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 
ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 
GGGCATGTTG CXATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 
TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 
TTAOTGGGTG GTOOTGAAGA TGaTGACACT GAXGATGATG GTGATGATGA TGATGACAGA 4440 
GATAGTCATG GCTTATCCAT TCATAAGTGT ATGTCATGCT CATCCTATAG AGAATCACAG 4500 
GAAAAGGTAA TGAATGATTC AGACACCCAC GAAAACAGTC TTATGGATCA GAATAATCCA 4560 
ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAAG TGTATCCTCA 4620 
QACAGTCSiAA CTGGTATGGA CAGAAGTCCT GGTAAATCAC CATCAGCAAA TG6GCTATCC 4680 
CAAAAOCACA ATGATGGAAA AGAGGAAAAT GACATTCAGA CTGGTAGTGC TCTGCTTCCT 4740 
CTCAOCCCTG AATCTAAAGC ATGGGCAOTT CTGACAAGTG ATGAAGAAAG TGGATCAGGG 4800 
CAAGGTACCT CAGATAGCCT TAATGAGAAT GAGACTTCCA CAGATTTCAG TTTTGCAGAC 4860 
ACTAATGAAA AAGATGCTGA TGGGATCCTG GCAGCAGGTG ACTCAGAAAT AACTCCTGGA 4920 
TTCCCACAGT CCCCAACATC ATCTGTTACT AGCGAGAACT CAGAAGTGTT CCACGTTTCA 4980 
GAGGCAGAGG CCAGTAATAG TAGCCATGAG TCTCGTATTG GTCTAGCTGA GGGGTTGGAA 5040 
TCCGAGAAGA AGGCAGTTAT ACCCCTTGTG ATCGTGTCAG CCCTGACTTT TATCTGTCTA 5100 
GTGGTTCTTG TGGGTATTCT CATCTACTGG AGGAAATGCT TCCAGACTGC ACACTTTTAC 5160 
TTAGAGGACA GTACATCCCC TAGAGTTATA TCCACACCTC CAACACCTAT CTTTCCAATT 5220 
TCAGATGATG TGGGAGCAAT TCCAATAAAG CACTTTCCAA AGCATGTTGC AGATTTACAT 5280 
GCAAGTAGTG GGTTTACTGA AGAATTTGAG ACACTGAAAG AGTTTTACCA GGAAGTGCAG 5340 
AGCTGTACTG TTGACTTAGG TATTACAGCA GACAGCTCCA ACCACCCAGA CAACAAGCAC 5400 
AAGAATCGAT ACATAAATAT CGTTGCCTAT GATCATAGCA GGGTTAAGCT AGCACAGCTT 5460 
GCTGAAAAGG ATGGCAAACT GACTGATTAT ATCAATGCCA ATTATGTTGA TGGCTACAAC 5520 
AGACCAAAAG CTTATATTGC TGCCCAAGGC CCACTGAAAT CCACAGCTGA AGATTTCTGG 5580 
AGAATGATAT GGGAACATAA TGTGGAAGTT ATTGTCATGA TAACAAACCT CGTGGAGAAA 5640 
GGAAGGAGAA AATGTGATCA GTACTGGCCT GCCGATGGGA GTGAGGAGTA CGGGAACTTT 5700 
CTGGTCACTC AGAAGAGTGT GCAAGTGCTT GCCTATTATA CTGTGAGGAA TTTTACTCTA 5760 
AGAAACACAA AAATAAAAAA GGGCTCCCAG AAAGGAAGAC CCAGTGGACG TGTGGTCACA 5820 
CAGTATCACT ACACGCAGTG GCCTGACATG GGAGTACCAG AGTACTCCCT GCCAGTGCTG 5880 
ACCTTTGTGA GAAAGGCAGC CTATGCCAAG CGCCATGCAG TGGGGCCTGT TGTCGTCCAC 5940 
TGCAGTGCTG GAGTTGGAAG AACAGGCACA TATATTGTGC TAGACAGTAT GTTGCAGCAG 6000 
ATTCAACACG AAGGAACTGT CAACATATTT GGCTTCTTAA AACACATCCG TTCACAAAGA 6060 
AATTATTTGG TACAAACTGA GGAGCAATAT GTCTTCATTC ATGATACACT GGTTGAGGCC 6120 
ATACTTAGTA AAGAAACTGA GGTGCTGGAC AGTCATATTC ATGCCTATGT TAATGCACTC 6180 
CTCATTCCTG GACCAGCAGG CAAAACAAAG CTAGAGAAAC AATTCCAGCT CCTGAGCCAG 6240 
TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 6300 
CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 6360 
GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 6420 
ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 6480 
CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 6540 
GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 6600 
GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 6660 
GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 6720 
AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 6780 
GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 6840 
TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 6900 
CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 6960 
CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGAGCACAA GGCAGGAAGA GAATCCATCC 7020 
ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 7080 
TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 7140 
CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 7200 
AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 7260 
TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 7320 
TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 7380 
ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 7440 
GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 7500 
TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 7560 
AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 7620 
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 7680 
TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 7740 
TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 7800 
ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 7860 
GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 583 Protein sequence 
Protein Accession #: NP_002842.1 

1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVCRLD WANOYVkQQR KLVBEIOWSY TGAIJIQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQSWDKTSL ENTPIHNTGK TVEINLTUDY RVSGGVSEMV 120 

PKASKITFHW GKCNMSSOGS EHSLB6QKFP LEMQIVCFDA DRFSSFEEAV KGKGKLRALS 180 

ILPBVGTEEN LDPXAIIDGV ESVSRFGKOA ALDPPILLNL LPHSTDKYYI YNGSLTSPPC 240 

TDTVDWIVPK DTVSISESQL AVFCBVLTMQ QSGYVMLMDY LQNNFRBQQY KPSRQVFS8Y 300 

TCKEEIHEAV CSSEPENVQA DPENVTSLLV TWERPRWYD TMIEKPAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLY6KY SDQLIVDMPT DNPBI*DIiPPB 420 

LIOTEEIIKB EEEGKDIBBG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRl GTKYMEAKTM 480 

RSPTRGSEPS GMQDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELpPHT VEGTSASUID 540 

GSKTVLRSPH KHLSGTAESL NTV8ITEYBB BSLLTSFKLD TOABDSSOSS PATSAZPFIS 600 

ENISQGYIPS SBNPETITYD VLIPESARNA SEDSTSSOSB ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR BSFLQTNYTE IRVDESEKTT KSPSAGPVMS QOPSVTDIjEM PHYSTFAYPP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVTfNGETPLQ PSYSSEVFPL VTPLLLDNQI 780 

LNTTPAASSS DSALBATPVF PSVDVSFBSI LSSYDGAFLL PFSSA SPSSB LFRHLHTVSQ 840 

ILPQVTSATB SDKVPLHASL PVAGGDLXjLB PSIiAOYSDVL STTBAASETXf EFGSBSOVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPEPSYAL SDSBQSQHIP TVSYSSAIPV HDSVGVTYQO 960 

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGBWSGAS SDSEFLLPDT DGLTALNISS 1020 

PVSVABPTYT TSVFGDDNKA I^SKSBIIYtaJ BTELOIPSFN BMVYPSESTV MPNMYDNVNK 1080 

IiNASLQETSV SISSTKGNFP GSLAHTTTKV FDBBISQVPE NNFSVQPTHT VSQASGDTSL 1140 

KPVLSANSEP ASSDFASSEK LSPSTQLLFY ETSASPSTEV LLQP8FQASD VDTLLKTVLP 1200 

AVPSDPILVB TPKVDKISST MLHLZVSNSA SSENMLESTS VPVPDV8PTS HMHSASLQGL 1260 

TI8YASEKYB PVLLKSESSH OVVPSLYSND ELPQTANLEI NQTffiPPKGRH VPATPVL3ID 1320 

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTP VSTDHSVPIG KGHVAITAVS 1380 

PHRDGSVTST KLLFPSKATS ELSHSAKSOA GLVGGGEDGD TDDDGDDDDD RDSDGLSZKX 1440 

CMSCSSYRBS QBKVMNSSDT HBNSLNDQMN PZSYSLSENS EBDHRVTSVS SDSQTtSflDRS 1500 

PGKSPSANGL SQKHiaXSKBE NDIQTGSALL PLSPSSXKNA VLT8DBB8GS GQGTSDSIiKE 1560 

NHTSTDPSFA DTNEKDADGI LAAGDSEITP QFPQSPTSSV TSEMSBVFHV SBAEASNS6H 1620 

ESRIGLABGL ESEKKAVIPL VIVSALTPIC LWLVGILZY HRKCFQTAHF YLEDSTSPRV 1680 

ISTPPTPIPP ISDDVGAIPI KHFPKHVADL HASSGFTEEF ETLKBFYQEV QSCTVDLGIT 1740 

ADSSNHPDNK BKKRYZKIVA YDBSRVKIAQ LASiaXIKLTD YINAHYVDGY NRPXAYIAAQ 1800 

GPIiXSTAEDF HRMZWEmiVE VIVKZTHLVE KGRRKCDQYH PADQSEEYGN FLVTQKSVQV 1860 

LAYYTVRNFT LRKTKIKKGS QKGRPSGRW TQYHYTQWPD MGVPBYSLPV LTPVRKAAYA 1920 

KRHAVGPVW HCSAGVGRTG TYZVLDSMLQ QZQHEGTVNI FGFLKHIRSQ RNYLVQTEEQ 1980 

YVFZHDTLVE AIIiSKETEVL DSHIHAYVNA ItLIPGPAGKT KLEKQFQIjLS QSHIQQSDYS 2040 

AALKQOIREK NRTSSIZFVB RSRVGZ6SL8 GBGHnTZNAS YIKOYYOSHE FI ITQHP LLH 2100 

TIKDFWRMZN DHNAQLWMI PDGQNMAEDE FVYHFNKDSP ZHCBSFKVTL KABEHKCLSN 2160 

ESiC£.ZIQDFI L6ATQDDYVL EVRHFQCPKW PNPDSPISKT FEZiZSVZXEB AANXUXSPMIV 2220 

HDEHGGVTAG TFCALTTLMH QLEXEHSVDV YQVAXMINLM RPGVFADZEQ YQFLYKVILS 2280 
LVSTRQEEKP ST8LDSNGAA LPDOIIAESL ESLV 

Seq ID NO: 584 DNA sequence 

Nucleic Acid Accession #: NM_005688.1 

Coding sequence: 126..4439 

1 11 21 31 41 51 

| | | | | | 

COGGGCAGGT QGCTCATGCT CGGGAGCGTG GTTGAGOGGC TQGaSOQQTT GTCCTGGAGC 60 

AGGGGCGCAG QAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTC06CTCAS 120 

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GG6TATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAQAGA CC6TGAAGAT TCCAAGTTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAOATG CCTTGOAAAC AGCAGCCOGA GCCGAGGGCC 300 

TCTCrCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAG6A6 CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTO AGTGCTCTGA A6CCCATCCG GACTACTTCC AAACACCAGC 420 

AOCCAGTGGA CAATGCTOGG W r m ' C CT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAQAAQ GGGQAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

AOGAOTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGG6CC AGACGCTGCT TGCCTGCGAA GGGTTGTGTG GATCTTCT6C OGCACCAGGC 660 

TCATCCTGTC CATCG7GTGC C7SA7X3ATCA CXjCAGCTGGC TG6CTTCAGT GGACCAGCCT 720 

TCaXGGTGAA ACACCTCTTG GAGTATAOCC AGGCAACAOA GTCTAACCTO CA6TACAGCT 780 

TGTTGTTAGT GCTGG6CCTC CTCXTItSACXSG AAATOQTGOG GTCTTGOTCG CTTGCACTGA 840 

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCrTAAGTTA AAGAACATTA AAOAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTOCTCCAA GGATGG6CAG AGAATGTTTO AOGCAGCAGC CGTTGGCAGC CTGCTQ6CT0 1020 

GA6GACCCX3T TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG 6GA0CAACA6 1080 

GCrrCCTGGG ATCAGCTOTT TTTATCCTCT TTTACCCAGC AATGATGTTT 6CATCACG6C 1140 

TCACAGCATA TTTCAGGAGA AAATQCX3TGG COGCCAOGGA TGAAOGTGTC CA GRAGAT GA 1200 

AT6AAGTTCT TACTTACATT AAATTTATCA AAATGTAT6C CTGGGTCAAA GCATTTTCTC 1260 

AOASTGTTCA AAAAATCX:GC GAGGAGGAOC GTOGGATATT QQAAAAAQQC GGGT ACTT OC 1320 

AG60TATCAC TGTG6QTGTG GCTCCCATTG TGGTGGTOAT TQCCAGGSTQ OTGACCTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCQATCTGA CAGCaGCACA GOCTTTCACA GTSGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAA6 TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AA6CCTCAGT GGCTGTTGAC AGATTTAAGA GTTTOTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCA6CCAGT CCTCACATCA AOATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCCAGT ATCCAGAACT CQCCCAAGCT GAOCCCCAAA ATCAAAAAAG 1680 

ACAAGAGQQC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCGGTGCT GGCAQAGCAG AAAGGCCACC TCCTCCTGGA CAGTGACGAG OGGCCCAGTC 1800 

COGAASAOQA A6AA6GCAAG CACATOCACC TQ6GCCACCT G08CTTACAG AGGACACTGC 1660 
ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCaGCCATTT TAGGCCAQAT OAaSCTTCIA (AGGOCAQCA 1980 

TTGCAATCAG TGC3AACCTTC GCTTATGTGG CCCA6CAG6C CTGGATCCTC AATOCTACTC 2040 

TGAGAQACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATAOAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGC3GACCTG ACGGAGATTG 2160 

GAGAGCGAQQ AGCCAACCTG AGCGOTGGGC AGCGCCAGAG GATCAGCCTT GCC0GG6CCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG AG6ACCGCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAO ACAGTTCT6T 2340 

TTGTTACCCA (XAGTTACAG TACCTGGTTG ACTGTGATQA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACXXATGAGG AACTQATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTQ CrGGGAGAOA CACOGCCAQT TGftGATCAAT TCAAAAAAGG 2520 

AAACCAGTG6 TTCACAGAAG AAGTCACAA6 ACAAGGGTCC TAAAACAGGA TCA6TAAA6A 2S80 

AGGAAAAAGC AGTAAAGCCA GAGGAA6GGC AGCTT6TGCA GCTG6AA6A5 AAAGG6CAG6 2640 

GTTCAGTGCC CTQGTCAGTA TATGOTGTCT ACATCCAGGC TGCTGGGC3GC CCCTTGGChT 2700 

TCCTCGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCXICCTTC AGCACCTGGT 2760 

GGTTQAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAA06AGA 2820 

GCTGGGTOAG TGACS^GCATG AAGGACAATC CTCATAT6CA GTACTATGOC A6CATCTACX3 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTC6 AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGC6 AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCOAAGQATC CTTOGAAGCX: 3000 

CTATGAAGTT TTTTGACAOG ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGA06TGCGG CTGCOGTTCC AGGC0GA6AT GTTCATCC3VG AACGTTATCC 3120 

TCGTGTTCTT CTGTGTGGGA ATGATOGCAO GAGTCTTCXX: GTGGTTOCTT GTQGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGQGTCCTG ATTCGGGAGC 3240 

TGAAGOGTCT 6GAC3VATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCX^GCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCaC AGATACCAGG 3360 

AGCTGCraOA T6ACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420 

CTOTGGGGCT GGACCTCATC A6CAT06CCC TCMCAOCAC CACGGGGCTG ATGATC6TTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAOAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGC ACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT OAGAACGCAG 3720 

AGATGAG6TA CGQA6AAAAC CTOCCTCTTG TCCTAAAQAA AGTATCCTTC ACGATCAAAC 3780 

CTAAA6AGAA GATTGGCATT GTG6GGG06A CASGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTCGAG TTATCTGGAG GCTGCATCAA GATTGATGCSA GTGA6AATCA 3900 

GTGATATTGG CCTTGCCGAC CTCC6AAGCA AACTCTCTAT CATTCCTCAA GAGCCQGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCSU^ CCAGTACACT GAAGACCAGA 4020 

TTTGGOATQC CCTG6A6M36 ACAC3VCATGA AAGAATGTAT TGCTCAGCTA CJCTCTGAAAC 4080 

TTGAATCTQA AGTOATGOAG AATGGGGATA ACTTCTCAGT GGGGGAAOGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC OGOCACTOTA AGATTCTGAT TTTAGATGftA GCCACAfiCTO 4200 

CCATG6ACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CC6A6AAGCA TTTGCAQACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACAOSGTTCT AGGCTCOSAT AGGATTATGQ 4320 

TGCTGGCCCA GGGACAG6TG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCOGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTOSCTOTC AAGGGCTGAC 4440 

TCCTCCCT6T TGACGAAGTC TCTTTTCTTT A6AGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

CCCCTCATCG CGTCCTCCTA CCGAAACCTT QCCTTTCTCa ATT TTAT CTT TOGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAOT GCATATTCCT TTCTATCATT TTTOTACAGr 4860 

TTGCTGTACT AGAGATCTGG TTTTQCTATT AOACTGTAGG AAGAOTAOCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCACG GTGCCAGGTT TTCTGQGTGT (XAAAGGAAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTQCC QCCTCCCCAC AGCOGCTCCA QGGGTGGCTG 5040 

GAGACGGGTG 66CGGCIGGA GACCATGCA6 AGCGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCXrrGGT GTCACTTACT GTTTCTGTCA GGAGAGCA6C QGGGOGAAGC CCAGG CCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGQATCA CAGAGACATT CCTCCGAGCC GGGQAGTTTC 5220 

TTTCCTGCCT T CT TCTTTTT GCT6TTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 

GTTGGTTCCA AGCCCTG6AG CCAACTGCTG CTTTTTGAGG TG6CACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TOGTGGOTCT GTTTTCCTTT 5460 

CTCAC0GCA6 TOSTCGCACA 6TCTCTCTCT CTCTCTOCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG COTAGAAGTT TTTGTACTOT AAAGAGACCT 5580 

ACCTCAQGTT GCTGGTTGCT GTGT6GTTTG GTGTCTTCCC GCAAACCCCC TTTG TGCT GT 5640 

GGGGCTGQTA GCTCAGGTGG GGGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCX3TT6C 5700 

ATGTCGTGAC CAACTAGACA TTCTOTCGOC TTAGCATOTT TGCTGAACAC CTTGIGGAAG 5760 

CAAAAATCTG AAAATGTCAA TAAAATTATT TTGGATTTT6 TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 585 Protein sequence
Protein Accession #: NP_005679.1 

1 11 21 31 41 51 

I t I I I I 

MKDIDIGKBY IIPSPGYRSV RERTSTSGTH RDRBDSKPRR TRPLECQDAL STAARAE6LS 60 

IiDASMHSQIiR ILDEEHPKGK YHBGLSAIiKP IRTTSKHQHP VDHAGLFSCM TFSWLSSIjAR 120 

VAHKKGEIiSM EDVWSL8KHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 180 

LSIVCLMITQ LAQFSGPAFM VKHLLSYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW 240 

AUJYRTGVRL RGAILTMAFK KILKLXHIKB KSLGBLINIC SKDGQRMPEA AAVQSUAGG 300 

PWAIL6MIY NVIILGPTGF LGSAVFILPY PAMMPASRLT AYFRRKCVAA TDESVQXMNB 360 

VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYPQ6 ITVOVAPIW VIASWTFSV 420 

HMTLGFDLTA AQAPTWTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEBVHMIK 480 

NKPASPHIKI EMKMATLAHD SSHSSIQNSP KLTPKMiCKDK RASRGKKEKV RQLQRTEHQA 540 

VLAEQKGHLL liDSDERPSPB EEE6KHIHLG HIiRLQRTIiHS IDLEIQEGKL VQXGGSVGSG 600 

KTSIiISAILG QMTLLBGSIA ISGTPAYVAQ QA»lliNATLR DHILPGKEYD BERYNSVLMS 660 

CCLRPDLAIL PSSDLTEIGE RGANIiSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 720 

NHIFNSAIRK HUCSKTVLFV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNIjNGDYATI 780 

FMNIiLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKB KAVKPBEGQL VQLBBKGQGS 840
VPHSVyeVYI QMGGPLAFL VIHALFHLNV GSTAFSmWL SYHIKQGSQI TTVTRGNETS 900
VSDSHK08PH HQnASiyAL SMAVMblLKH IfiSWFVKQT LRASSRUIDB LFRRILRSPM 960
KFFDTTPIGB ILNRFSXDHD BVDVRbPFQH EMFIOMVILV FFCVGHIASV PPWPLVAVGP 1020
LVILF8VI.BI VSRVLIRBLK RLDHITQSPF LSHITSSIQG lATIHAnnOS QEPLHRYQEL 1080
LDDNQAPFFIi ?TCAMSMLAV RLDLISIALI TTTGLMIVLM RGQIFPAYAS lAISYAVQLT 1140
GbFQFTVKIA SBTEARFTSV BRINBITIKTL SLEAFARIKN KAPSPOHPQE GBVTFENAEM 1200
HYKEHMLVL KKVSPTIKPK EKIGIVGETG SGKS8L6NU PR1.VELSGGC IKIDGVRISD 1260
IQIADUtSXb StIPQBPVLP SOTVSSIILDP FHOYTEDQIN DAI.ERTHMKE CIAQLPItKLE 1320
SBVMBIBia)? SVOBRQLLCI ARALLSBCKI ULDBhTAAN DTBTObLIOE TIREAFADCT 1380
MLTUBBIiBr VLQSDRIKVL AQaQWBFOT PSVUiSHDSS RFMMFAAAB NKVAVKO

Seq ID NO: 586 DNA sequence
Nucleic Acid Accession #: NM_001327.1
Coding sequence: 89..631

1 11 21 31 41 51
| | | | | |
AGCAGGGGGC GCTGTGTGTA CCGAQAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60
CTGAGA6CCG GGCAGA6GCT CGGGAGCCAT GCAGQCOGAA GGC06GG6CA CAGGGGGTTC 120
GAOGGGCGAT GCTGATGGCC CAGGAGGCCC TGGCATTCCT GATG6C0CA0 GGGGCAATGC 180
TGGCGGCCCA 6GAGAGGCGG GTGCCACGGG CGGCA6AGGT CCC06G60QG CAGGGGCAGC 240
AAG66CCT0Q GGGCCG6GAG QAGGCX3CCCC GCGGGGTCCG CATGGCGGCX5 C6GCTTCAGG 300
GCTGAATGGA TGCTGCAGAT GCGGGGCCAG GGGGCCGGAG AGCOGCCTGC TTGAGTTCTA 360
CCTCX3CCATG CCTTTCGCGA CACCCATGGA AGCAGAGCTG GCCOGCAGQA GCCTGGCXXZA 420
GGATGCCCCA CCGCTTCCCG TGCCAGGGGT GCTTCTGAAO GAOTTCACTG TOTCGGQCAA 480
CATACTGACT ATCOGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCATCAGCTC 540
CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACOCAQ TGCTTTCTGC CCGTGTTTTT 600
GGCTCAGCCT CCCTCAGG6C AGAGGCGCTA AGCCCAGCCT GGOGCCCCTT CCTAGGTCAT 660
GCCTCCTCCC CTAGGGAATO 6TCCCA6CAC 6AGTGQCCA0 TTCAITGTGG GGGCCTGATT 720
GTTTGTC6CT GQAG6A6GAC GGCTTAGftTO TTTGTTTCTG TAGAAAATAA AACTQAGCTA

Seq ID NO: 587 Protein sequence
Protein Accession #: NP_001318.1

1 11 21 31 41 51
| | | | | |
MQAEGRGTGG STGDAZJGPGG PGIPZ7GPGGN AGGPGBAGAT GGRGPROAGA AKA5GPGGGA 60
PRGPHGGAAS GIiNGCCRCXSA RGPESRLLEP YLAMPFATPM EAELARRSLA QDAPPLFVPG 120
VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQIiSLLM WITQCFLPVF LAQPPSGQRR

Seq ID NO: 588 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 52..459

1 11 21 31 41 51
| | | | | |
CCT0GTG6GC CCTGACCTTC TCTCTGAGAG C0GG6CAGAG GCTCCOQAGC CATGCAGGCC 60
GAAGGCCAGG GCACAGGGGG TTCGACGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 120
GCTGATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGA6AGG CG6GTQCCAC 6GGCGGCAGA 180
GGTCCCOGGG G06CAG6GGC AGCAAGG6CC TGGG66C0QA GAG6AGGGGC CCCGOGGGGT 240
006CATG600 GT6C0GCTTC TGCGCAGQAT GGAAGGTGGC CCTGOGGGGC CAGGAGGCC6 300
GACAGCCGCC TGCTTCAGTT CCGACTGACT GCTGCAGACC ACOGCCAACT GCAGCTCTCC 360
ATCAGCTCCT GTCTCCAGCA 6CTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420
GTGTTTTTGG CTCA6GCTCC CTCAGGGCAG AGGCGCTAAG COCAGCCTGG OGCCCCTTCC 480
TAGGTCAT6C CTCCTCCCCT AGGGAATG6T CCCA6CA0GA GIGGCCAGTT CATTGTGGGG 540
GCCTGATTGT TTGT06CTG6 AGGAGQ^^GG CTTACATGTT TGTTTCTGTA GAAAATAAAO 600
CTGAGCTA

Seq ID NO: 589 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |
NQAEGQGTG6 STGDADGPGG PGIFDGPG6N AOGPGBAGAT GGRGPRGA6A ARA5GPRGGA 60
PRGPHGGAAS AQDGRCP06A RRPOSRLLQF RLTAADHRQL QLSISSCLQQ LSIiLHWITQC 120
FLPVFLAQAP SGQRR

Seq ID NO: 590 DNA sequence
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671

1 11 21 31 41 51
| | | | | |
ACAGOGGAQC 6CA6AGTGA6 AACCACCAAC CGAGG0GC09 6GCAG0GACC CCTGCAGOGG 60
AGACAQAGAC TGAGCGGCCC GGCACCGCCA TGCCTQCGCT CTGGCTGGGC TGCTGCCTCT 120
GCTTCTCGCT CCTCCTGCCC GCAGCCOSGG CCACCTCCAG GAGGGAAGTC TGrGATTGCA 180
ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT G6TAATGGAT 240
TOOGCTGCCT CAACTGCAAT QACAACACTQ ATGGCATTCA CTQOGAGAAG TGCAAGAATG 300
GCTTTTACCQ GCACAGAGAA AGGGAG06CT GTTTGCCCTO CAATTGTAAC TCCAAAG6TT 360
CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCA6CTG TAAACCAG6T GTGACAGGAG 420
CCAGATGCGA CCGATOTCTG CCAGGCTTCC ACATGCTCAC GGATG06GGG TGCACCCAAG 480
ACCAGAGACT GCTAOACTCC AAGTGTGACT GTGACGCAGC TGGCATCGCA GGGCCCTGTG 540
AGGCQQQCCG CTQT8TCTGC AAGCCAQCra TTACTGGAGA AOGCiaTGAT AOGTOTOGAT 600
CAGGTTACTA TAATCTG6AT GGGGGGAACC CTGAGQGCTO TACGCAGTQT TTCT6CTATG 660
GGCATTCAGC CAGCTGCCGC AGCTCTOCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720
TTCATCAAGA TGTTGATGGC TGGAAGGCT6 TCCAAOGAAA TGGGTCTCCT GCAAAGCTCC 780
AATGGTCACA GCGCCATCAA GAT6TGTTTA GCTCAGCCCA AC6ACTAGAC CCTGTCTATT 840
TTGTGGCTCC TGCC3^TTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900
TTGACTACCS TGTGGACAGA GGA66CA6AC ACCCATCTGC CCATGATGIQ ATTCtGQAAG 960
GTGCTGGTCT ACX3GATCACA GCTCCCTTGA TGCCACTTGG CAAGACACT6 CCTTGT6GGC 1020
TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080
TGAGTTACTT T6AGTAT0GA AGGTTACTGC G6AATCTCAC AGCCCTCCGC ATCCGAGCTA 1140
CATATGGAGA ATACAGTACT GGGTACATTQ ACAATGTQAC CCTGATTTCA GCCCGCCXn^ 1200
TCTCTGGAGC CCCAGCACCC TGGGTTGAAC A6TGTATATG TCCTGTTGGG TACAAGGGQC 1260
AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GG6CCTTTTG 1320
GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380
GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440
ACGATCCX^ CGACCCCCGC AGCTGCAA6C CATGTOCCTG TCATAACGGG TTCAGCTGCT 1500
CAGTGATGCC QGA6ACGGAG GAGGTGGTGT QCAATAAGTG CCCTCCCGGG QTCACGQGTG 1560
CCOSCTGTQA GCTCTGTGCT GATGGCTACT TTGGOQACCC CTTTGGT6AA CATGGCCCAG 1620
TGAQGGCTTG TCAGCCCTGT CAATGCAACA ACAATGTGQA CCCCAGTOCC TCTGGQAATT 1680
OTGACGGGCT GACAG6CAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCX3 1740
ACCAQTGCAA AGCAGGCTAC TTCGQGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800
OAGCTTQCAA CTGTAACCCC ATGGGCTCAO AGCCTGIAGO ATOTOSAAOT GATGGCACCT 1860
GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTOTCCAG 1920
CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGA6AATGG 1980
AGGCCCTGAT TTCAAAGGCT CAGG6TGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040
GCAGGATGCA GCAOGCTGAG CAGGCCCTTC AGGACATTCr GAGAGATGCC CAGATTTCAG 2100
AAGGTGCTAG CA6ATCCCTT GGTCTCCAGT T6GCXAAGGT GAGGAGCCAA GAGAACAGCT 2160
ACCAGAGCCG CCTOGATOAC CTCAAGAT6A CTGTGGAAAG AGTTOGGGCT CTGGGAAGTC 2220
AGTACCAGAA CCGAGTTOQG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGCCTGG 2280
CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGG6C 2340
CAAAT6GCTT TAAAAGTCTG 6CTCAGGAGG CCACAA6ATT AGCA6AAAGC CACGTTGAGT 2400
CAGCCAGTAA CATGGAGCAA CTGACAAGG6 AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460
CACTGGTGCG CAAGGCCCTG CATGAAGGA6 TOSGAAGOGG AAGOGGTAGC CCGGACGGTG 2520
CTGTGGTGCA AGGGCTTGTG GAAAAATFGG AGAAAACCAA GTCCCTGGCC CAGCAGTTGA 2580
CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640
TCCTGQATTC AGTGTCTOGG CTTCAGGGAO TCAGTGATCA GTCCTTTCAG GTGGAA6AAG 2700
CAAA6AGGAT CAAACAAAAA GC6(SVTTCAC TCTCAACGCT 6GTAACCAGG CATATGGATG 2760
AGTTCAAGCG TACACAAAAG AATCTG6SAA ACTGGAAA6A AGAAGCACAG CAGCTCTTAC 2820
AGAATG6AAA AAGTGGGAGA GAGAAATCA6 ATCAGCTGCT TTCCCGTGCC AATCTT6CTA 2880
AAAGCAGA6C ACAAGAAGCA CTGAGTATGG QCAATGCCAC TTTTTATQAA GTTGAGAGGA 2940
TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA QAAGCTGAAG 3000
AAGCCAT6AA GAGACTCTOC TACATCA6CC AGAAGGTTTC AGATGCCAGT GACAAQACCC 3060
AGCAAGCAOA AAGAGCCCTO GGGAGOGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120
CCGQGGAGGC CCTGGAAATC TCCAGTGASA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180
AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGOACIGSCC TCTCTQAAGA 3240
GTGAGATGAO GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT 6ACACQAATA 3300
TGGATGCAGT ACAGATGGTG ATTACAQAAG CCXAGAAGGT TGATACCAGA GCCAAGAAOG 3360
CTGGOGTTAC AATCCAAGAC ACACrCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420
AGCCTCTCAG TCTAGATGAA GAGGGGCTGO TCTTACTGGA 6CA6AAGCTT TCCOGAGCCA 3480
AGACCCAGAT CAACAGCCAA CTGOSQOCCA TGATGTCAGA GCTGGAAGAG AGGGCAGGTC 3540
AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600
AGAACTT6GA 6AACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACX: CAGGCTCTTG 3660
AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACT6AGGT TCTT6GGATA CAGATCTCAG 3720
GGCTCGGGAG CCATGTCATG TGA6TGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780
TATGCTCAGG TCAACTQACC TGftCCCCATT CCTGKTCCCA TGOCCAGGTG GTTGTCTTAT 3840
TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAAT6AGGC AGATA6CACT GGGTGTGAGA 3900
ATGATCAAGG ATCTGGACXX: CAAAGAATAG ACTGGATQGA AAGACAAACT GCACAG6CAG 3960
ATGTrrGCCT CATAATAGTC GTAAGTGGAG TCCTQGAATT TGGACAAGTG CTGTTGGGAT 4020
ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080
ATGAAATTCT TCCTAATGTC AGAACAahGT GCAACCCA6T CACACTGTGG CCAGTAAAAT 4140
ACTATTGCCT CATATTGTCC TCTQCAAGCT TCTTGCTGAT aUSAGTTCCT CCTACTTACA 4200
ACX:CAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AA6TGAGCAG TGTTGGAGTG 4260
AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320
GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380
ATTTTTATTA AAOCATTTCC TACCAOOUiA GCAAAT6TTG GGAAAGTATT TACTTTTTCQ 4440
GTTTCAAAGT GATAGAAAAO T0TG0CTT60 GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500
ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560
ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCC3VCCCAT AATAAGA8AA TOTTOCTACT 4620
CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCX3^TCCATC TTTCCATCCA 4680
TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740
GTGQGACAGT 6GT6ACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800
AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAA6T GGTGTTTATT 4860
GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGAGCCTCC 4920
CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT G6GTTGTGCA CATTTCTTTG 4980
CATTCCA6CT GTCACTCTGT 6CXTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040
TAACACCAQT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100
TGGTGCTGCC TTQCTTCTGT ATTTCCTTGG ATTTTCCTGA AA6TGTTTTT AAATAAAQAA 5160
CAATTGTTAG ATOCC



Seq ID NO: 591 Protein sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51
| | | | | |
MPALWLGCCL CFSItLLPAAR ATSRREVCDC NGKSRQCIPD RELHRQTOIG FRCLHCMDMT 60
DGIKCEKCKN GFYBHRERDR OiPClTQTSKG SLSARCDKSG RCSCKPGVTG ARCDRCLPaP 120
HMLTDAGCTQ DQRLLDSRCD CDPAGIA6PC DA^CVCKPA VTGERCDRCR SGYYNLDG6N 180
PEGCTQCFCY OHSASCRSSA EYSVEKITST FHQDVDGWKA VQRN6SPAKI1 QWSQRHQDVP 240
SSAQRLDPVY FVAPAKFZjGN QQVSYGQSI18 PDYRVDRGGR HPSAHDVILE QAQLRITAPL 300
MPL6KTLPCG LTKTYTPRLN EKPsmmspQ LSYFEYRRLL RNLTALRIRA TYGBYSTGYI 360
DNVTLISARP VSGAPAPWVE QCICPVGYKG QFOQDCASGY KRDSARLGPF GTCIPCHCQG 420
GGACDPDTGD CYS6DENPDI ECADCPIGFY IIDPHDPRSCK PCPCHNGFSC SVMPBTEBW 480

CMNCPPGVTa ARCEtiCAOGY FODPFGEHGP VRPCQPOQCM NNVDPSASGH CDRLTGRCLK 540 

CIHHTAOIYC DQGKAGYFGD PUIFNPADKC RACNCHPM6S EPVGCRSDGT CVCKPGF6GP 600 

NCEHGAFSCP ACYNQVKIQM DQFMQQI«QRM EALISKAQGG DGWPDTELE GRMQQAEQAL 660 

QDILROAQIS E6ASRSL6LQ LAKVRSQEHS YQSRLDDLKM TVERVRALGS QYQNRVSDTH 720 

RLZTQMQLSL AESEASLGNT KIPASDHYVG FMGPK8I1AQB ATRLAESHVfi SASNHEOLTR 780 

ETBDYSKOAL SLVRKALHEG VGSG56SPDG AWQGLVEKL EKTKSLAQQL TREATQABIC 840 

ADRSYQHSIiR LLDSVSRLQG VSDQSFQVEE AKRIKQKADS IiSTLVTRHMD EFKRTQKNt.G 900 

NWKEEAQQIiL QN6KSGREKS DQLLSRANLA KSRAQEAI«SM GMATFYBVBS ILKKLREFDL 960 

QVDNRKABAE EAMKRLSYI8 QKV8DASDKT QQAERALGSA AADAQRAXNG A6EALBISSB 1020 

lEQElGSUJL EANVTADGAL AMEKGI«ASLK SBMRBVEt^ ERXEUBFDfm MDAVQMVITE 1080 

AQKVDTRAKN AGVTIQDTIiN TLOGLIiKLMD QPLSVDEEOL VLLEQKLSRA KTQZNSQItRP 1140 
^e4SELEERAR QQRGHIiHLLB TSIDGILADV KISLEmRDNL PPGCXKTQAL EQQ 

Seq ID NO: 592 DNA sequence
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221..856

1 11 21 31 41 51

I I I I I I 

GAGCAACCTC AGCTICTAOT ATCCAOACTC CAGGGCOQCC CCQG6GGGGG ACCCCAACCC 60 

CGACCCAQAG CTTCTCCAGC GQOGGOGCAO C3GAGCAGG6C TCCCCX3CCTT AACTTCCTCC 120 

GCGGGGCCCA GCCACCTTCG GGAGTCCGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180 

ACCTGCCACC CCTGAGCCAG CG0QGGCX5CC CGAQOQAGTC ATGGCCAAOG CGGGGCTGCA 240 

6CTGTT6GGC TTCATTCTCG CCTTCCTGGO AT6GAT0GGC GCCATOGTCA GCACT6CCCT 300 

GCCOCAGTOG AGGATTTACT CCTATGGGGG CQACAACATC GTGACCGCCC AGGCXAT6TA 360 

CGAGGGGCTG T6GATGTCCT GCGTGTOGCA GA6CAC0GGG CAGATCCAGT GCAAAGTCTT 420 

TGACTCCTTG CTGAATCTGA GCAGCACATT GCAA6CAACC OGTGCCTTGA TGGTGGTTGG 480

CATCCTCCTG GGAGTGATAQ CAATCTTTGT GGCCACCGTT GGCATGAAGT GTATGAAGTQ 540

CTTGGAAGAC GATQAGQTGC AGAAGATGAG GATGGCTGTC ATTGGGGGTG OGATATTTCT 600 

TCTTGCAOGT CIGGCTATTT TAGTTGOCAC AGCATGOTAT GGCSVATAGAA TCaTrCAAOA 660 

ATTCTATQAC CCTATGACCC CAGTCAATGC CAGGTACGAA TTTGGTCAGO CTCTCTTCAC 720 

TG6CTGGQCT GCTGCTTCTC TCT6CCTTCT GGQAQGTGCC CTACTTT6CT GTTCCTGTCC 780 

COGAAAAACA ACCTCTTACC CAACACCAAG GCCCTATCX:A AAACCTGC31C CTTCCAQOGG 840 

6AAAGACTAC GTGTGACACA GAGGCAAAAG GAGAAAATCA TGTTGAAACA AACCX3AAAAT 900 

GGACATTGA6 ATAC7ATCAT TAACATTAG6 ACCTTAGAAT TTTGGOXATT GTAATCTGAA 960 

GTATGGTATT ACAAAACAAA CAAACAAACA AAAAACCCAT GTGTYAAAAT ACTCAS TGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAA6 ATTTTACCAT 1080 

TTGTATTACT GCTTCCCATT GAGTAATCAT ACTCAAATGG GGGAAGGGGT GCTCCTTAAA 1140 

TATATATAGA TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

CTCATTATGT TGATACTAOC ATACTTAAAA XATCTCTAAA AXAG6TAAAT GTATTTAATT 1260 

CCATATTGAT GAAOATGTTT ATTGGTATAT TTTCTTTTTC GTCCTTATAT ACATATOTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTTGQGTG CCTTTGCCAC AAGACCTAGC 1380 

CTAATTTACC AAGGATGAAT TCTTTCAATT CTTCATGCGT 6CCCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAGCACTTG CATCGTTATT AAGCCCTTAT TTGTTTTGTG 1500 

TTTC9VTTG0T CTCTATCTCC T6AATCTAAC ACATTTCATA GCCSACKm TAGTTTCTAA 1560 

AGCCAAGAAO AATTTATTAC AAATCAGAAC TTTG6A6GCA AATCTTTCTG CATGACX»AA 1620 

GTGATAAATT CCTGTTGACC TTCCCACACA ATCCCTGTAC TCTQACCCAT AGCACTCTTG 1680 

TTTGCTTTGA AAATATTTGT CCAATTGAGT AGCTQCATGC TGTTCCCOCA GGTGTTGTAA 1740 

CACAACTTTA TTGATTGAAT TTTTAAGCTA CTTATTCATA GTTTTATATC CCCCTAAACT 1800 

ACCTTTTTGT TCCCXMTCC TTAATTOTAT TOTTTTCOCA AGTGTAATTA TCATGOGTTT 1860

TATATCTTCC TAATAAGGTG TGGTCTGTTT GTCTQAACAA AGTGCTAGAC T TTCT GGAGT 1920 

QATAATCT6G TOACAAATAT TCTCTCT6TA GCTGTAAGCA AGTCACTTAA TCTTTCTACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGAGATAAT GATACTTAAC CAGTTAGAAG AGQTAGT6TG 2040 

AATATTAATT AGTTTATATT ACTCTCATTC TTTGAACATG AACTATGCXn* ATGTAGTGTC 2100 

TTTATTTGCT CA6CTGGCTG A6ACACT6AA GAAGTCACTG AACAAAACCT ACACACGTAC 2160 

CTTCATGTGA TTCACTGCCT TCCTCTCTCT ACCAGTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCATGTGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CCACTGAACA 2280 

AAACCTAOGC ACATACCTTC ATGTGGCTCA GTGCCTTCCT CTCTCTACCA GTCTATTTCC 2340 

ATTCTTTCAG CTGTGTCTGA CATGTTTGTG CTCTGTTCCA TTTTAACAAC TGCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAAGAT GATGTATGGA AAGGGTGTTG 2460 

GCACTGGTGT CTGGAGACCT 06ATTTGAGT CTTGGTQCTA TCAATCACCG TCTGTGTTTG 2520 

AGCAAGGCAT TTGGCTGCTO TAAGCTTATT GCTTCATCT6 TAAGCGGTGG TTTGTAATTC 2580 

CTQATCTTCC CACCTCACAO TGATGrTGTG 6GGAT0CAGT OAGATAGAAT ACATGTAAGT 2640 

GTGGTTTTGT AATTTGAAAA GTQCTATACT AAGGGAAAGA ATTGAG6AAT TAACTGCATA 2700 

ctarm m j i m ttgcttttca aatgtttgaa aataaaaaaa tgttaagaaa togotttctt 2760 

GCXTTTAACXa^ GTCTCTCAAG TGATGAGACA GTGAAGTAAA ATTGAGTGCA CTAAAOGAAT 2820 

AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATOCTTT CTGTGGCTAA 2880

ACAQATGTAA TGGGAAGAAA TAAAAGOCTA GGTGTIGGTA AAT0CAACA6 CAAOGGAGAT 2940 

TTTTQAATCA TAATAACTCA TAA6GTGCTA TCTGTTCAGT GA7QCCCTCA OAGCTCTTGC 3000 

TGTTAGCTGQ CAGCTGACGC TGCTAGGATA GTTAGTTTGQ AAATGGTACT TC ATAA TAAA 3060 

CTACACAAGG AAAGTCAGCC ACCGTGTCTT ATGAGGAATT GGACCTAATA AATTTTAGTG 3120

TGCCTTCOm ACCTGAGAAT ATATGCTTTT 6GAAGTTAAA ATTTAAATGG CTTTTGCCAC 3180 

ATACATAOAT CTTCATQATO TGTGAGTGTA ATTCCATGTG OATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT 7TTATQGCCC AAAATGACCA AOQAAATTGT TACAAXAGAA TTTA T CCftAT 3300 

TTTGATCTTT TTATATTCTT CTACCACACC TGGAAACAQA CCAATAGACA TTTTGQGGTT 3360 

TTATAATGGG AATTTOTATA AAGCATTACT CTTTTTCAAT AAATTCTTTT TTAATTTAAA 3420 
AAAAG6AAAA AAAAAAAAAA AAA 

Seq ID NO: 593 Protein sequence
Protein Accession #: AAD16433.1 

1 11 21 31 41 51 

I I I I I I 

MANAGLQLLG FXLAFLGWIO AIVSTAIiPQH RIYSYAGDIIX VTAQAHYEGL WMSCVSQSTG 60 

QIQCKVPDSL LNLSSTLQAT RALMWGILL GVIAIFVATV GMKCHKCIiED DEVQKMRMAV 120 

IGGAIFLLAG LAILVATAWY QNRIVQEFYD PMTPVNARYE FGOALFTOWA AASimUSQA 180 
IiLCCSCPRKT TSYPTPRPYP KPAP5SGKDY V

Seq ID NO: 594 DNA sequence
Nucleic Acid Accession #: NM_006180.1
Coding sequence: 352..2820

1 11 21 31 41 51

CCCCCATTCQ CATCTAACAA GGAATCTGCG CCCCAGAGAG TCCCGGA06C C6CCGQTCX30 
TOCCOGOOGC GCCGGOCCAT GCAGOSACGG CCGCCX3CGGA GCTCOSAGCA GCGGTAGCGC 
CCCCCTGTAA AGGGaTTOOC TATGCCGGGA CCACTGTGAA CCCTGCCGCC TGCCGGAACA 
CTCTTCGCTC CGQACCAGCT CAGCCTCTGA TAAGCTGOAC TCQGCACGCC OGCAACAAGC 
ACCGAGGAGT TAAGAGAGCC GCAAGCGCAG GGAAGGCCTC CCCGCROQGG TGOOafflgAG 
OGGCCGGTGC AGCGCX3GGGA CAGGCACTCG GGCTGQCACT GOCTGCTAOG GATGTOOTOT 
TOGATAAGGT GGCATOGACC CGCCATGGCG CGGCTCTGGG GCTTCTGCTG GCTGGTTGT6 
GOCrTCTGOA GQGCCGCTTT CGCCTGTCCC ACGTCCTGCA AATGCAGTGC CTCTCGGATC 
TGGTGCAGCG ACCCTTCTCC TGGCATCGTQ GCATTTCOQA GATTGGAGCC TAACAGTGTA 
6ATCCTQAGA ACATCACCGA AATTTTCATC GCAAACCAQA AAACSGTTAGA AATCATOAC
GAAGATGATG TTGAAGCTTA TGTGGGACTO AGAAATCTGA C3UITK3TGGA TTCTGGATTA 
AAATTTGTGG CTO^TAAAGC ATTTCTQAAA AACAGCAACC TGCAGCACAT CAATTTTACC 
CORAACAAAC TGACGAGTTT GTCTAGGAAA CATTTCCGTC ACCTTQACTT GTCT6AACTG 
ATCCTGGTGG GCAATCCATT TACATGCTCC TCTGACATTA TGTGGATCAA GACTCTCCSVA 
GAGQCTAAAT CCAQTCCAGA CACTCAGGAT TTOTACTGCC TGAATGAAAO CAO CAA GAAT 
ATTCCCCTGG CAAACCTGCA GATACCCAAT TGTGQTTTGC CATCTGCAAA TCTGGCCGCA 
CCTAACCTCA CTGTGGAGGA AGGAAAGTCT ATCACATTAT CCTGTAGTGT GGCAGGTGAT 
COGGTTCCTA ATATGTATTG GGATGTTGGT AACCTGGTTT CCAAACATAT GAATGAAACA 
AGCCACACAC AGGGCTCCTT AAGGATAACT AACATTTCAT CCGATGACAG TGGGAAGCAG 
ATCTCTTGTG TGGCGGAAAA TCTTGfTAGOA SAAGATCAAG ATTCTGTCAA CCTCACTGTG 
CATTTTGCAC CAACTATCAC ATTTCTCQAA TCTCCAACCT CaGACCACCA CTGGTGCATT 
CCATTCACTG TGAAAGGCAA CCCCAAACCA GC6CTTCAGT GGTrCTATAA CGGG6CAATA 
TTGAATGAGT CCAAATACAT CTCTACTAAA ATACATGTTA CCAATCACAC GGAGTACCAC 
GGCTGCCrCC AGCTGGATAA TCCCACTCAC ATGAACAATG GGGACTACAC TCTAAW^CC 
AAGAATOftCT ATGGOAAGGA TGAGAAACAG ATTTCTGCTC ACTTCATGGG CTGGCCTGOA 
ATT^OOATO OTGCAAACCC AAATTATCCT OATGTAATTT ATGAAGATTA TGOAACTGQ^ 
GCGAATGACA TCGGGGACAC CACGAACAGA AGTAATQAAA TCCCTTCOVC AGACQTCACT 
QATAAAACCG GTCGGGAACA TCTCTCX3GTC TATGCTQTGG TGGTOATTGC gTCTGT GG TG 
GGATTTTGCC TTTTGGTAAT GCTGTTTCTG CTTAAGTTGQ CAAGACACTC CAAGTTT6GC 
ATGAAAG6CC CAGCCTCCGT TATCAGCaAT 6ATGATQACT CTGCCAGCCC ACTCCATCAC 
ATCTCCAATG GOAGTAACAC TOCATCTTCT TCGQAAGGT6 GCCCAGATGC TGTCATTATT 
GGAATGACCA AGATCCCTQT CATTGAAAAT CCCCAGTACT TTGGCATCAC CAACAGTCAG 
CTCAAGCCAG ACACATTTGT TCAGCACATC AAGC6ACATA ACATTGTTCT CftAAftG^AG 
CTAGGGQAAG GAGCCTTTOG AAAAGTGTTC CTAGCTGAAT GCTATAACCT CTOTCXTGAO 
CAfiGACAAGA TCTTOGTGGC AGTGAAGACC CTGAAGGATG CCAGTGACAA TGCACQCAAG 
QACTTCCACC GT6A6GCCQA GCTCCTGACC AACCTCCAGC ATGAGCACAT CGTCAAGTTC 
TATGGCGTCT GCGTGGAGGG CXSACCCCCTC ATCATGGTCT TTGAGTACAT 3AAGCATGGG 
GACCrCAACA AGTTCCTCAG 6GCACACGGC CCTGATGCOQ TGCTOATGGC TGMGGC^ 
CCGCCCAOSG AACTCACGCA GTOGCAGATO CTGCATATAG OCCAGCAGAT CGCCGO^C 
ATGGTCTACX: TCGCGTCCCA GCACTTCOTG CACCGC6ATT TGGCCACCAG GAACTGCCTG 
GTCGGGGAGA ACTTCCTGGT GAAAATCGGG GACTTTGGGA TGTCCCGGGA CGTGTACAGC 
ACTGACTACT ACAGGGTCGG TGGCCACACA ATGCTGCCCA TTCGCTGGAT GCCTCOVGAG 
AGCATCATQT ACAGQAAATT CACX3ACQGAA AfiOGAOGSTCT G6AGCCTGGQ GQTCGTGTTG 
TGGQAGATTT TCACCTATGG CAAACAGCOC TGOTACCAGC TCTCAAACaA TGAGCTGATA 
GAOTOTATCA CTCAGGGCCG AGTCCTGCAG CGACCCCGCA CGTGCCCCCA GGAGGTGTAT 
GAGCrOATCC TGGGGTGCTG GCAGCGAGAG CCCCACATQA GGAAGAACAT CAAGGGCATC 
CATACCCTCC TTCAGAACTT 6GCCAAGGCA TCTCX3QGTCT ACCTGGACAT TCTAGGCTAG 
6GCCCTTTTC CCCAGACCGA TCCTTCCCAA OSTACTCCTC AOACGGGCTG AGAGGJ^GAA 
CATCTTTTAA CTGCCGCTGG AGGCCACCAA GCTGCTCTCC TTCACTCTGA CAGTATTAAC 
ATCAAAGACT CCGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCTTCATC CATAGACACA 
GTATTGACTT CTTTTTGGCA TTATCTCTTT CTCTCTTTCC ATCTCCCTTG GTTGTTCCTT 
TTTCTTTTTT TAAATTTTCT TTTTCTTCTT TTTTTTCGTC TTCCCTGCTT CACGATTCTT 
ACCCTTTCTT TTGAATCAAT CTQGCTTCTO CATTOCTATT AACTCTGCAT AGACAAAGGC 
CTTAACAAAC CTAATTTGTT ATATCftGCAG ACACTCCAGT TTGCCCACCA CAACTAACAA 
TGCCTTGTTG TATTCCTCCC TTTGATGTGG ATGAAAAAAA GGGAAAACAA ATATTTCACT 
TAAACTTTGT CACTTCTGCT GTACAGATAT CGAGAGTTTC TATGGATTCA CTTCTATTTA 
TTTATTATTA TTACTGTTCT TATTCTTTTT GGATGGCTTA AGCCTGTOTA TAAAAAAGAA 
AACTTG3OTT CaATCTGTGA AGCCTTTATC TATGGGAGAT TAAAACCAGA GAGAAAGAAG 
ATTTATTATG AACCGCAATA TGGGAGGAAC AAAGACAACC ACTGGGATCA GCTGGTGTCA 
GTCCCTACTT AG6AAATACT CAGCAACTGT TAGCTGGGAA GAATGTATTC GGCACCTTCC 
CCTGAGGACC TTTCTGAGGA GTAAAAAGAC TACTGGCCTC TGTGCCATOO ATGATTCTTT 
TCCCATCACC AGAAATQATA GCGTGCAGTA GAGAGCAAAG ATGGCTT 

Seq ID NO: 595 Protein sequence
Protein Accession #: NP_006171.1

1 11 21 31 41 51
| | | | | |
MSSWIRWHGP AMARLWGFCn LWGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 60
NSVDPENITE IFIANQKRLE IINEDDVEAY VGUINLTIVD S6LKFVAHKA FliKKSNLQHI 120
NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 180
SKNIPIiAHIiQ IPNCQLFSAN LAAFNLTVEE GICSITLSCSV AGDPVPNMYW DVGNLVSKHM 240
NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300
WCIPFTVKGN PKPALQWFYN GAIIjNESXYI CTKIHVTNHT EYHGCLQIiDN PTHMNNGDYT 360
LIAKNEYGKD SKQISAHFM6 WFGIODGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPS7 420
DVTDKT6REH LSVYAVWIA SVVOFCI1I.VM LFLLKX1ARE8 KFSMRGPASV ISNDDDSASP 480
XXXXXXXXXX P8SSE6GFDA VIIGMTKIPV ZENPQYFGIT NSQIiKPDTFV QaiKRHNIVL 540
KRELGEGAFG KVFLAECYNL CPEQDKILVA VKTLKDASDN ARKDFKREAS LIiTNLQHEKI 600
VKFYGVCVEG DPLIMVPEYM KHGDLNKPLR AHGPDAVLMA EGNPPTELTQ SQKLHIAQQI 660
AA6KVYLASQ HPVHRDLATR NCLVGSNLLV KIGDFGMSRD VYSTDYYRVG GHTMLPIRWM 720
PPESIMYRKP TTBSDVWSLG WLWEIFTYG KQPWYQLSNN EVIECITQGR VLQRPRTCPQ 780
EVySLNLGCW QRBPHMXUCKI KGIEfTLLC9n< AKASFVYLDI L6

Seq ID NO: 596 DNA sequence
Nucleic Acid Accession #: AF410899
Coding sequence: 483..2999

1 

1 

GGGAGCAGGA 
GCAGQAGCCT 
TGCATACC3GG 
CC6CCX3GT06 
AG06GTAGCG 
CTGCCGGAAC 
CCGCAACAAG 
GTG6GGGAAA 
GGATGTCGTC 
GGCTGGTTGT 
CCTCTOGQAT 
CTAACAGTCT 
AAATCATCAA 
ATTCTGGATT 
TCAATTTTAC 
TGTCTQAACT 
AGACTCTCCA 
GCAGCAAGAA 
ATCTGGCOGC 
TGGCAGOTGA 
IGAATOAAAC 
GTQGGAAGGA 
ACCTCACTGT 
ACTGGTGCAT 
ACGGGGCAAT 
CXSGAGTACCA 
CTCTAATAGC 
GCTGGCCTGG 
ATGGAACTGC 
CAOACGTCAC 
OSTCTGTGGT 
CCAAGTTTGG 
GTGTTGGCCC 
TCTCCAATGG 
6AATGACCAA 
TCAAGCCAGA 
TAOGCQAAGG 
AGGACAAGAT 
ACTTCCACCQ 
ATGGGGTCTG 
ACCTCAACAA 
06C0CACGGA 
TGGTCTACCT 
TCGGGGAGAA 
CTGACTACTA 
GCATCATGTA 
GGGA6ATTTT 
AGTGTATCAC 
A6CTGATGCT 
ATACCCTCCT 
GCCCTTTTCC 
ATCnriTAAC 
TCAAAGACTC 
TATTGACTTC 
TTCTTTTTTT 
OCCTTTCTTT 
TTAACAAAC6 

AAACTTTGTC 
TTATTATTAT 
ACTTGTGTTC 
TTTATTATGA 
TCCCTACTTA 
CTGAG6ACCT 
CCCATCACCA 
ATGGCGCATA 
TTGTTTCTCA 
ATGTCCAQAQ 



11 

1 

GCCTCGCTGG 
GGACCCAGGC 
ACCCCCATTC 
GTGCCGGGCG 
CCCCCCTGTA 
ACTCTTCGCT 
CACCGAGGA6 
60GGCGGGTG 
CTGGATAAGG 
GGGCTTCTGG 
CTGGTGCAGC 
AGATCCTGA6 
CX2AAGATGAT 
AAAATTTGTQ 
COGAAACAAA 
GATCCTGOTG 
AOAOGCTAAA 
TATTCCCCTG 
ACCTAACCTC 
TC06GTTCCT 
AAOCCACACA 

GATcrcrrsT 

GCATTTTGCA 
TCCATTCACT 
ATTGAATGAG 
OGGCTGCCTC 
CAAGAATGAQ 
AATTGAOGAT 
AGCGAATGAC 
TCA TAAAA CC 
GGGATTnGC 
CATGAAA6AT 
AGCCTCOGTT 
GAGTAACACT 
GATCCCTGTC 
CACATTTGTT 
AGCCTTTQ6A 
CTTGGTGGCA 
TX3A06CCX3AG 
CX3TGGAGGGC 
GTT0CTC3V66 
ACTGAC6CAG 
GGCGTCCCAG 
CTTGCTGGTG 
CAGGGTOGGT 
CA6GAAATTC 
CACCTATG6C 
TCAGGGCCGA 
G6GGTGCTGG 
TCAGAACTTG 
CCAOACCGAT 
TGGOGCTGGA 
OGAGAAGCTC 
TTTTTGGCAT 
AAATTTTCTT 
TQAATCAATC 
TAATTT6TTA 
ATTCCTGCCT 
ACITCTGCT G 
TACTGTTCTT 
AATCTGTQAA 
ACX»CAATAT 
GGAAATACTC 
TTCTGAG6AG 
GAAATGATAG 
6T6TGCTCGG 
AGCGCTATCC 
CTCATTTCXaG 



21 
I 

CTGCTTOGCT 
GC06GCGGCX3 
GCATCTAACA 
OGCOGGGCCA 
AAG0G6TTGG 
COQOACCAGC 
TTAAGAGA6C 
CAGCGCGGGG 
TGGCATGGAC 
AG6GC0GCTT 
GACGCTTCrC 
AACATGAC30G 
GTTGAAGCTT 
GCTCATAAAG 
CTGACGAOTT 
GGCAATCCAT- 
TOCAGTCGAS 
GCAAACCTGC 
ACTGTGGAGG 
AATAT6TATT 
CAGGGCTCCT 
GTGG0G6AAA 
CCAACTATCA 
GTGAAAGGCA 
TCCAAATACA 
CAGCTGGATA 
TATGG6AAG6 
GGTGCAAACC 
ATOGGGGACA 
GGTOQGGAAC 
CTTTTG6TAA 
TTCTCATGGT 
ATCA GCAATG 
CCATCTTCTT 
ATTGAAAATC 
CAGCACATCA 
AAA6TGTTCC 
GTGAAGACCC 
CTCXrrGACCA 
GACCCCCTCA 
GGACAGGGCC 
T0GCAGAT6C 
CACTT0GT6C 
AAAATGGGGG 
GGCCACACSU^ 
ACGAOGGAAA 
AAACAGOCCT 
GTCCTGCAGC 
CAGOQAGAGC 
GCCAAGGCAT 
CCTTCCCAAC 
GGCCACCAAG 
TCGAGGGAAG 
TATCTCTTTC 
TTTCTTCTTT 
TGGCTTCTGC 
TATCAGCA6A 
TTGATGIOQA 
TACAGATATC 
ATTGTTTTTG 
QCCTTTATCT 
GGGAGGAACA 
AGCAACTQTT 
TAAAAA6ACT 
CGTGCAGTAG 
ACACASTTTT 
ACAGAACCTT 
G6TCAGGT66 



31 
I 

OGOGCTCTAC 
GGOGTGAGGC 
AQGAATCTGC 
TGCAGC6ACG 
CTATGCOQGG 
TCAGCCTCTG 
C6CAAGCGCA 
ACAGGCACTC 
CCGCCATGGC 
T06CCT6TCC 
CTG6CAT0GT 
AAATTTTCAT 
ATGTGGGACT 
CATTTCTQAA 
TGTCTAGQAA 
TTACATGCTC 
ACACTCAG6A 
AGATACCCAA 
AAGGAAAGTC 
GGGATGTTGG 
TAA66ATAAC 
ATCTTGTAGG 
CATTTCTCGA 
ACCCCAAACC 
TCTGTACTAA 
ATCCCACrCA 
ATGA6AAACA 
CAAATTATCC 
CCACQAACAG 
ATCTCTCGGT 
TGCTGTTTCT 
TTGGATTTGG 
ATGATGACTC 
CQGAA6GTGG 
CCCAGTACTT 
AGOQACATAA 
TAGCTQAATO 
TGAAGGATGC 
ACCTCCAGCA 
TCATGGTCTT 
CTGATGCOOT 
TGCATATA6C 
ACOGOGATTT 
ACTTTGGOAT 
TGCTGCCCAT 
OGQAOQTCTO 
GGTACCAGCT 
GACCCOGCAC 
CCCACATGAG 
CTCCGGTCTA 
GTACTCCTCA 
CTGCTCTCCT 
CA0TGT6TAC 
TCTCTTTCCA 
TTTTTOGTCT 
ATTACTATTA 
CACTCCAQTT 
TQAAAAAAAG 
GASAGTTTCT 
6ATGGCTTAA 
ATGGGAGATT 
AAGACAACCA 
AGCTGGGAAG 
ACTTGGOCTCT 
A6AGCAAAGA 
GTCTTOGTAG 
TOTCAACTTC 
GAAAGCC 



41 

I 

GCGCTCA6TC 
GC0GGA6CCC 
GCCCCAQAGA 
GCCGCCGCGG 
ACCACTGTGA 
ATAAGCTOGA 
GG6AAGGCCT 
GGGCTGGCAC 

GCxsGcrcroQ 

CAC3QTCCTGC 
6GCATTTCGB 
C3GCAAACCAG 
GAGAAATCTG 
AAACAQCAAC 
ACATTTCCGT 
CtGTGACATT 
TTTGTACTGC 
TTGTGGTTTG 
TATCACATTA 
TAACCTGGTT 
TAACATTTCA 
AGAAGATCAA 
ATCrCCAACC 
AOOGCTTCAG 
AATACATGTT 
CATOAACAAT 
GATTTCTGCT 
TGATOTAATT 
AAGTAATGAA 
CTATGCTGTG 
GCTTAAGTT6 
GAAAGTAAAA 
TGCCAOCCCA 
CCCAQATGCT 
TG6CATCAGC 
CATTQTTCTO 
CTATAAOCTC 
CAGTGAGAAT 
TGAGCACATC 
TGAGTACATG 
GCTGATGGCT 
CCA6CAGATC 
GGCCACCAGG 
GTCCCGGGAC 
TCGCTGGAT6 
GAGCCTGGG6 
GTCAAACAAT 

gtgcccccag 
gaagaacatc 
cx:tggacatt 
gacgggctga 

TCACTCTGAC 
TTCTTCATCC 
TCTCCCTTG6 
TCCCTGCTTC 
ACTCTGCATA 
TGCCCACCAC 
GGAAAACAAA 
ATGQATTCAC 
GCCT6TGTAT 
AAAACCAGAG 
CTGGGATCAG 
AATOTATTGO 
GTGOCATGGA 
TGOCTTCOQT 
aTTGTGATOA 
AGTTGAAAAG 



51 
I 

CCCGGG6GTA 
GGCCTC6AG6 
GTCCCGGACG 
AGCTCCX3AGC 
ACCCTGCCGC 
CTC3GGCAC6C 
CCCCSCACX30 
TGGCTGCTAO 
GGCTTCTGCT 
AAATGCAGTG 
AOATTGGAGC 
AAAAGGTTAG 
ACAATTGTGG 
CT6CAGCACA 
CACCTTGACT 
ATGT83ATCA 
CT6AAT6AAA 
CCATCTGCAA 
TCCTGTAGTO 
TCCAAACATA 
TCGGATGACA 
GATTCTGTCA 
TCAGACCACC 
TGGTTCTATA 
ACCAATCACA 
GGGGACTACA 
CACTTOVTGG 
TATGAA6ATT 
ATCCCTTCCA 
6TQGTGATTG 
GCAAGAOICT 
TCAAGACAA6 
CTCCATCACA 
GTCATTATTG 
AACAGTCA6C 
AAAAGG6AGC 
TGTGCTGAGC 
GCAOGCAAGG 
GTCAAGTTCT 
AAQCATGGGG 
GAGGGCAACC 
GC0G0G6GCA 
AACTGCCTGG 
GTGTACAC3CA 
CCTCCAGAGA 
GTOGTGTTGT 
GAGGTGATAG 
GAGGTGTAT6 
AAGGGCATCC 
CTAGGCTAGG 
GAGGATGAAC 
AGTATTAACS^ 
ATAGACACAG 
TTeTTOCTTT 
AOQATTCTTA 
GACAAAGGCC 
AACTAACAAT 
TATTTCACTT 
TTCTATTTAT 
AAAAAA6AAA 
AGAAAGAAGA 
CTGGTGTCAG 
GCACCTTCCC 
TGATTCTTTT 
6AGACACAA6 
TAGCACTGGT 
A6GT6GATTC

Seq ID NO: 597 Protein sequence
Protein Accession #: AAL67965.1

1 11 21 31 41 51

| | | | | |

M8SWIRWH0P AMARLWGFCH LWGFWRAAF ACPTSCKCSA SRZWCSDP8P GIVAPPRLEP 60

NSVDPENITE IPIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKPVAHKA FLKNSNWJHI 120

NFTRNKLTSIi SRKHFRHLDL SELILVGNPP TCSCDIMWIK TLQBAKSSPD TQDLYOilJES 180

SKNIPIANLQ IPNCGLPSAN LAAPNLTVEE GKSITIiSCSV AGDPVPNMYW DVGNI.VSKHM 240

NETSHTQGSL RITNISSDDS GKQISCVAEM LVGEDQDSVN LTVHFAPTIT FIiBSPTSDHH 300 

WCIPFTVKOI PKPALQWPYN GAILNESKYI CTKIHVraHT BYHGCKJIiDlI PTHMNNGDYT 360 

IilAKMEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYBDY GTAAHDIGDT TNRSNEIPST 420 

DVnnCTORBH LSVYAVWIA SWGFdiLVM I.FLLKLARHS KF®IKDPSWF GFGKVKSRQG 480 

VOPASVISND DDSASPLHHI SNGSNTPSSS BGGPDAVIIG MTKIPVIENP QYFGITNSQL 540

KPDTFVQHIK RHNIVLKRBL GEGAFGKVPL AECYllIiCPEQ DKIIiVAVKTL KDASDNARKD 600 

PHREAELLTN IiQHEHIVKFY GVCVEGDPLI KVFEYHKHGD UIKPLRAHGP DAVLMAEOHP 660 

PTELTQSQML HIAQQIAAGM VYLASQHFVH RDLATBMCLV GBHLLVKIGD PGMSRDVnTST 720 

DYYRVQGHTM UPHttlMPPBS IMYRKFTTBS OVWSLGWUI BIFTYOKQPW YQLSNNEVIE 780
CITQGRVLQR PRTCPQBVYB LMLGCWQREP HMRKNIKOIH TLLQRLAKAS PVYLDILO 

Seq ID NO: 598 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74..814

1 11 21 31 41 51

AAAACCTTGA GQTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAQCO 60 

CTCTGGGTCC TTAATGGCAG CAGCCGCCGC TACCAAGATC CTTCTGTGCC TCCCGCTTCT 120 

GCTCCTGCTG TCCGGCTGGT CCCGGGCTCG GCGAGCOGAC CCTCACTCTC TTTGCTATGA 180 

CATCACCGTC ATCCCTAAGT TCAGACXTTOG ACCACGGTGQ TGTGC36GTTC AAGGCCAGGT 240 

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGOCAACAAG ACAGTCACAC CTGTCAGTCX: 300 

CCTGGGGAAG AAACTAAATG TCACAACGOC CTGGAAAGOV CAGAACCC3M3 TACTGAGAQA 360 

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCaG 480

TGGATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540

AATGTG6ACA ACGGTTCATC CTQQAOCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTQTGQCX: ATQTCCTTCC ATTACTTCTC AATGOGAQAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTCATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720 

CrCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCaT 780 

CATCCTCCOC TCCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTOATAC CAAAA06CTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACGACCTA06 OTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960 

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAArTTTA CCROCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAO 1080 

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTrTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATQATAT TGTCAGTAAA ATAATC3VCGT 1200

TAOACTTCAG ACCTCTGG06 ATTCTTTCCG TOTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

AT3UU3AAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TCTACTGATA 1320 
TTTAftATAAA GAOTTCTATT TCCCAAAAAA AAAAAAAAAA AA 

Seq ID NO: 599 Protein sequence
Protein Accession #: BAB61048.1

1 11 21 31 41 51 

MAAAAATKIL LCLPLLLLLS GWSRAGRADP HSIiCYDITVI PKPRPGPRHC AVQOQVDEKT 60 

FLHYXXXSJKT VTPVSPLGKK LNVTTAWKAQ NPVLRBWDI LTEQIiRDIQL EHYTPKEPLT 120 

WJAHMSCBQK AEGHSSGSWQ FSPDGQIPIjL PDSEKRMWTT VHPGARKHKE KWENDKWAM 180 

SFHYPSMQDC IGMLEDPLKG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCCLLHIiPC 240 

FILPGI 



Seq ID NO: 600 DNA sequence 

Nucleic Acid Accession #: NM_001898.1

Coding sequence: 57..482

1 11 21 31 41 51 

GGCTCrCAOC CTCCTCTCCT GCAGCTCCAG CTTTGTGCTC TGCCTCTGAG GAGACCATGG 60 

CCCAGTATCT GAGTACCCTG CTGCTCCTOC TOGCCACCCT AGCTGTGGCC CTGGCCTGGA 120 

GCCCCAAGGA GGAGGATAGG ATAATCCOGG GTOOCATCTA TAACGCAGAC CTCAATGATG 180

AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCQAGTA TAACAAGGCC ACCAAAGATG 240 

ACTACTACAG ACGTCCGCTG CGG6TACTAA GAGCCAGGCA ACA6AC0GTT QGGGGGGTGA 300 

ATTACTTCTT CGAOGTAQAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CCCAACTTGG 360 

ACACCTOTOC CTTCCATOAA CRGCCAGAAC TGCAGAAGAA ACA6TTGTGC TCTTTCGAGA 420 

TCTAOGAAGT TOCCTGGQAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 480 

AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 540 

CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 600 

GACA6ACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AQGGCQCTCT OOCCTCCCTC 660 

CTTOCTTCTT GCTTCTAATA GCOCTGGTAC ATGGTACRCA CCCCCCCACC TCCTGCAATT 720 
AAACAGTAGC ATG6CC 



Seq ID NO: 601 Protein sequence 
Protein Accession #: NP_001889.1

1 11 21 31 41 51

MAOYLSTLIiI. LIATLAVAIA WaPKEEDail PGOIYHAWW DBMVQHALHF AISEYMKATK 60 
DDYYRRPLRV LRARQQTVGG VMYPFDVBVG RTICTKSQPN tDTCAPHBQP EWKKQIiCSF 120 
EIYEVPWENR RSLVKSRCQE S 



Seq ID NO: 602 DNA sequence 

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299..961

1 11 21 31 41 51

| | | | | |

CTCTGAGCTT CTCTGAGCCT TGTTTGCTCA TCTGQAAAAA GGGGATTAAA CCATTTACXrr 60 

CAT6GAGTTG TBAAAGAATA GCTOCAAAGC ACCTAACACA TAGTAAGQTT CCCAGT6CA0 120 

CTACTTCTGC TOGOTTGAOT CTAQCTGTGT AGGCCCCTTG TTCCTCACCT GQAGAAACTG 180 

GGGTGGCAGG CCGGTCCCCC ACAAAAGATA ACTCATCTCT TAATTTQCAA GCTGCCTCAA 240 

CAGGAGGGTQ GGGGMCAGC TCAACAATCG CTGATGGGCG CTCCrGGTGT TCATAGAGAT 300 

QGAACrPGGA CTTGOAGGCC TCTCCAOGCT GTCCCACTGC CCCTGGCCTA GG0C3GCAGCC 360 

TOCCCTGTGG CCCACXXTGG CCGCTCTGGC TCTGCTGAGC AGOGTCGCAG AGGCCTCCCT 420 

OGGCTCCGOG CCCCGCAGCC CTGCCCCCCG CGAA6GCCCC CCGCCTGTCC TGGCGTCCCC 480 

0GCCG6CCAC CTGC06GGGG GAC6CA0GGC CC6CTGGTGC AGTGGAAGAG CC06GCG0GC 540 

GCCGCCGCAG CCTTCTOGGC CCGCGCCCCC GCOSCCTGCA CCCCCATCTQ CTCTTCCCCO 600 

OGGGGGCCGC GCGGCGCGG6 CTGGGGGCCC 6GGCAGCCGC GCTCGGGCAG CGGGGGCGCQ 660 

GGGCTGCOGC CTQCGCTCGC AGCTG6TGCC GQTGCX3CGCG CTCGGCXrrGQ GCCACCGCTC 720 

OGAOGAGCTG GTGCGTTTCC GCTTCTGCAG CGGCTCCTOC CGCC6CGCX3C GCTCTCCACA 780 

OGACCTCAGC CTGGCCAGCC TACTGG60SC C6SGSCCCTS OGACOGCOCC CQGGCTCCOS 840 

GCCCGTCAGC CAGCCCTGCT GCOOACCCAC GOQCTACOAA GGGGTCTCCT TCATGGIICGT 900 

CAACAQCACC TGGAQAACCG TGGACOOCCT CTCCGCCACC GCCTQCGGCT GCCTGG6CT0 960 

AGGGCTCGCT CCAGGGCTTT GCAGACTGGA CCCTTACOGG TGGCTCTTCC TGCCTGG6aC 1020 

CCTCCCGCAG AGTCCCACTA GCCAGCGGCC TCAGCX3«SGG AOQAAGGCCT CAAAGCTQAG 1080 

AGGCCCCTAC OGGTGGGTGA TGGATATCAT CCCOQAACAQ 6T6AAGGGAC AACTQACTAG 1140 

CAOCCCCAGA GCCXTTCACCC TG08GAT0CC AGCCTAAAAO ACACCAGAGA CCTC3l6CrA7 1200

GGAGCCCTTC GOACCCACTT CTCACAGACT CTGGGACrGG CCAO6CCT0G AACXTPGGGAC 1260 

CCCTCXrrcrG ATGAACACTA CAGTGGCTGA GGCATCSISCC CCXSCCCAGG CCCTGTAOGG 1320 

ACAGCATTTG AAGGACACAT ATTGCAGTT6 CTTGGTTGAA AGTGCCTGTG CT6QAACTO0 1380 
CCTGTACTCA CTCATGGGAG CTGGCCCC 

Seq ID NO: 603 Protein sequence

Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGLG6LST LSHCPWPRRQ PALnPTLAAL ALLSSVAEAS L6SAPRSFAP REGPPFVLAS 60 

PAGHIiPGGRT ARV7C5GRARR PPPQPSRPAP PPPAPPSALP RGGRAARA6G P68RARAAGA 120 

RGCRLR8QLV FVRALGLGHR SDELVRFRFC SGSCRRARSP HDZiSLASLLG AGAIiRPPPGS 180 
RPVSQPCCRP TRYEAVSFHD VKSTWRTVDR LSATAOGCLG 

Seq ID NO: 604 DNA sequence
Nucleic Acid Accession #: NM_057091.1
Coding sequence: 783..1445

1 11 21 31 41 51

| | | | | |

ACTGQCCX3CT GAGAGAAGAA T0G6GTGGA6 CAGA6AGCAG CTGCTGCAGG GCAGACA6CC 60 

OGACCCCCAA ATCTGCACGT ACCAGCAGTC AGCCGCCCCA CGCAGGGACC GGCTTACCCC 120 

TOGCTCCCOQ CCCTCACTCA CTTTCTCCCG CCCTOGGCCC GGCCTCCCAG CTCTCTACTT 180 

CQG8TGTCTA CAAACTCAAC TCCCGGTTTC OGTGCCTCTC CACGGCTOGA 6TTCTCTACT 240 

CTCCATATCC GAGGG6CCCC TCCCAOCATC TACGCm:TC CCAACCTCGG GGGACCTAGC 300 

CAAGCTAGGG GOGACTGGAT CCGA0GGGT6 GAGCAGCCAG GTGAGCCCCG AAAG6TGGGG 360 

OSGGGCAGGG GOGCTCCCAG CCCCACXXTOG GGATCTGGTG AOGCTGGGGC TGGAATTTGA 420 

CACC3GGA0GG CTGG6G0GGC GGGCAGGAG6 CTGCTGAGGG ATGGAOTTGG GCXX36GCCCC 480 

CAQACAAG6C CG6GGGGCTC 0GGCAGCA6C A6GTCCCTCG 6GCC0CA0CC CTOGCTGCCA 540 

COOGGGCXTFG GAGCCGCACA CCXX3AS6GT6 CAQACTGGCT 6CCAAGGCCA CACTTTTGGC 600 

TAAAAGAGGC ACTGCCA6GT GTACAGTCCT GGGCAT6CGC TGTTTGAGCT TOGGGGGAGA 660 

GCXXAGCACT 6GTCCCCGGA AAGGTGCCTA QAAGAACAAG GTGCA6QACC CCGTGCTGCX: 720 

TCAACAGGA0 GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCTG GTGTTGATAG 780 

AGATGGAACT TGGACTTGGA GGCCTCTCCA OGCTGTCCCA CIGCCCCTGG CCTAGGCXKSC 840 

A0CCTGCCCT 0TG6C0CACC CTG6C06CTC 7GGCTCTGCT GAGCAGCGTC GCAGAGGCCT 900 

CCCTGGGCPC CGOGCCCOGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCXT GTCCTGGCGT 960 

CCCCCGCCGG CCACCTGCCX3 GGGGGACGCA OGGCCCGCTG GTGCAGTGQA AGAGCCCX3GC 1020 

GGCCX3CCGCC GCAGCXTTTCT OGGCCCGCGC CCCCGCOGCC TGCACCCCCA TCTGCTCTTC 1080

CCCGCGGGGG CCGCGCGGCG CX3GGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGQ 1140 

OGCGGGGCTG COGCCTGCGC TC3GCAGCTGG TGCOGGTGCG CGCGCTCGGC CTGGQCCACC 1200 

GCTCCaSACGA GCIGGTGOGT TTCCX3CTTCT GCAGCGGCTC CTOCOSOOGC GCGCQCTCrC 1260 

CACACGACCT CAGCCTOGCC AGCC7ACTGG GCGCC6GGGC CCTG0QAC08 CCCCCQGGCT 1320 

CCOGGCCCGT CAGCCAGCCC TGCTGCCQAC CCACXJOGCTA CXSAAGGGGTC TCCTTCAT60 1380 

ACGTCAACAG CACCTGGAGA ACCQTGGACC QCCTCTCCGC CACCGCCTQC CCX:TGCXrrGG 1440 

GCTGAGGGCT CGCTCCAGGQ CTTTGCAGAC TGQACCCTTA COGQTGGCTC TTCCTGCCTG 1500

GGACCCTCCC GCAGAGTCCC ACTAGCCAGC GGCCTCAGCX: AGGGAOGAAG GCCTCAAAGC 1560 

TSAGAGGCCC CTACCX3GTGG GTSATGOATA TCATOOOCGA ACAGGTGAAG GGACAACT6A 1620 

CTAGCAGCCC CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCA6 1680 

CTATQGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG 1740 

GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCC06CC CAGGCCCTCT 1800 

AGGGACAGCA TTTGAAGGAC ACATATTGCA GTT6CTT6GT T6AAAGTGCC TGTGCTGGAA 1860 
CTGGCCTGTA CTCACTCATG GGAGCTGGCC CC 

Seq ID NO: 605 Protein sequence
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGIiGGLST LSHCPHPRRQ PALWPTLAAL ALIjSSVAEAS LGSAPRSPAP REGPPPVLAS 60 

PAGHLPGGRT ARHCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120 

RGCRZiRSQLV PVRALGLGHR SDELVRFRFC SGSCRRAR8P HDLSLASIiLG AGALRPPPGS 180 
RPVSQPCCRP TRYBAVSPHD VKSTHRTVDR bSATAGGCLO 



Seq ID NO: 606 DNA sequence

Nucleic Acid Accession #: NM_057160.1

Coding sequence: 1..714

1 11 21 31 41 51

ATGCCCGGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTG AGGTCCTTCC TCCCX»AGCC 60 

CACCTGQGTG CCCTCTTTCT CCCTGAGGCT CX»CrTGGTC TCTCCGCGCA GCCTGCCCTG 120 

TC3GCCC3VCX:C TGGCCGCTCT GGCTCTGCTG AGCftGOOTCO CRGAGGCCTC CCTQQ6CTCC 180 

GCGCCCCGCA GCCCTGCCCC CCGCC3AAGGC CXCCOGCCTG TCCTGGCGTC CCCOGCOGQC 240 

CACCTGCCGG GGGC5ACGCAC GGCCOGCTGG TGCAGTOGAA GAGCCOSGCG GCCGCCX3C06 300 

CAGCCTTCTC GGCCCGCGCE CCCGCCGCXTT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360 

CGCGC3GGC6C GGGCTGGOGG CCCGOGCAGC CGOGCTCGGG CAGCGGGGGC GCGGGGCTGC 420 

CGCCTGCGCT CGCAQCTOOT 0CCG0TGCX3C GCGCTGOGCC TGGGCCACCG CTCGGRCXSAO 480

CT6GT6CGTT TCCGCTTCTO CACSCGGCTCX T0CCX5CCGC0 CGCGCTCTCC ACACGACCTC 540 

AGCCTCGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCX3GCCCGTC 600 

AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660 

ACCTGGAQAA COGTGGACCG CCTCTCCGCC ACCGCCTGCG GCTGCCTGQG CTGAGGGCTC 720 

GCTCCRGGGC TTTOOWSACT 6GACCCTTAC CGGTGGCTCT TCCTOCCTGG GACCCTCCCG 780 

CaGAGTCCCA CTAGCCAGCG GCCTCAGCCA GG6ACGAAGG CCTCAAAGCT GAQAGGCCCC 840 

TACOSGTGGG TGATGGATAT CATCCCCGAA CAG6TGAAGG GACAACTGAC TAGCAGCCCC 900 

AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAG AGACCTCAGC TATGGAGCCC 960 

TTCGQACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TOGAACCTGG GACCCCTCCT 1020 

CTOATOAACA CTACftGTGGC XQAGGCATCA GCCCCCSQCCC AOGCCCTQTA OGGACAGCAT 1080 

TTGAAQGACA CATATTGCAO TTGCTTOOTT GAAAOTOCCT OTGCTGOAAC TGQCCTGTAC 1140 
TCACTCATGG GAGCTGGCCC C 

Seq ID NO: 607 Protein sequence
Protein Accession #: NP_476501.1

1 11 21 31 41 51 

MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAIi HPTLAAIALI* SSVAEASL69 60 

APR8PAPREG PPPVIASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 120 

SAARAOGPOS HARAAGAHGC RLRSQLVPVR ALGLGHRSDE LVRPRFCSQS CRRARSPEDL 180 
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY BAVSFMDVNS TWRTVDRLSA TACGCLQ 



Seq ID NO: 608 DNA sequence

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29..715

1 11 21 31 41 51

CTGATGGGCQ CTCCTGQTGT TGATAGRGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 60 

OTCCCACTGC CCCTGGCCTA 6GCGGCAGQC TCCACTTGGT CTCTCCGCGC AGCCTGCCCT 120 

GTQGCCXACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 180 

CGOGCCCCGC AGCCCT6CCC CCXXK3GAAGG CCCCCC6CCT GTCCIGGGQT CXXXXS3C0QG 240 

CCACCTCCCG GGGGGACGCA CGGCCXXKTTG GTGCAQT6GA AOAGCCCOOC QQCCGCCX^C 300 

GCAGCCTTCT CGGCCCGCJGC CCCC6CCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 360 

CCGCGCGGCG CGGGCTGGGG GCCCGOGCAG CCGCXSCTCGQ GCAGC3QGGGG aXXSCCGCTG 420 

CCGCXrrGCGC TCGCRGCTGG TGCCGGTQCG CGCGCTC6GC CTGGGCCACC GCTCCGACX3A 480 

GCTGGTOCGT TTCCGCTTCT GCAGCOQCTC CTOCCOCCQC OOGOGCTCTC CACACGACCT 540 

CAGCCTG6CC AGCXTTACTGO GCXSCOSGGQC CCTSCGACXXS CCCOCXSGGCT CCCGGCCCGT 600 

CAGCCA6CCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG 660 

CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720 

CXSCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCX:C 780 

GCAGAGTOCC ACTAGCCaGC GGCCTCAeCC AGGGAOGAAG GCCTCSUUVGC TGAGAGGCCC 840 

CTACCX3GTG0 GTGATGGATA TCATCCXXX3A ACAOGTGAAO GGACAACTGA CTAGOMSCCC 900 

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAO CTATGGAQCC 960 

CTTCGGACCC ACTTCTCACA GACTCT6GCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020 

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACaOCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TOAAAtSTGCC TOTGCTQOAA CTOGCCTCXA 1140 
CTCACTCATG GGAGCTGGCC CC 



Seq ID NO: 609 Protein sequence
Protein Accession #: NP_476431.1

1 11 21 31 41 51

| | | | | |

MEL6LQGLST LSHCPWPRRQ APLGLSAQPA X^WPTLAAUOi ItSSVAEASLG SAPRSPAPRB 60
QPPPVLASPA GHLPGGRTAR HCSGRARRPP FQPSRPAPPP PAFP8ALPRG GRAARAG6PG 120
SRARAAGARO CRLRSQLVFV RALGLGHRSD ELVRFSFCSO 6CRRAR8PHD IiSLASLLGAG 180
ALRPPP6SRP VSQPCCRPTR YBAVSFMDVN STWRTVDRLfi ATA06CL0



Seq ID NO: 610 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1746 

1 11 21 31 41 51 

ATGCCACTCA LcATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTG6GG TGCAGGGTrG 60 

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGO CCTCCCAGGT GGA6TGCACC 120 

GGGGGACGCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA ACGCCATGAG CCTGCAGATC 180 

CTCAACAC6C ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCTGASGA TTQACSAAOAA TGAGCTOTCO CGCATCACGC CrGGGGCCTT CCGAAACCTG 300 

GGCTCGCTOC GCTATCTCAG CCTOQCCAAC AACRAGCTGC AGGTTCTGCC CATCGGCCTC 360 

TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 420 

CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGGAGTTGCA CGGCAACCAC 480 

CTGGAATACA TCCCTGACGG AGCCTTC6AC CACCTGGTAG 6ACTCA0QRA GCTCAATCTO 540 



QGCAAGAATA GCCTCACCCA CATCTCACCC AOGGTCTTCC AGCACCT6QG CAATCTCCAO 600

GTCCTCCOGC TOTATGAGAA CAGGCTCAQG GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

QTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCrGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAOACTC TACCTC3TCCA ACAACCACAT CTCCCAGCTG 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACOGTC TTACTCTCTT TGGQAATTGC 840 

CTGAAOGAGC TCTCTCTGGG GATCTTOGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCQ CCAGTTGCAG 960 

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACX3GQCTA 1020 

ACGGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA OSGQAATOTC 1080 

TTCCQCATGT TGGCCAACCT GCAGAACATC TCCCTGCAQA ACAATCGCCT CAGACAGCTC 1140 

CCAGG6AATA TCTTCGCCAA CGTCAATGGC CTCATG60CA TCCAGCTGCA OAACAACCAO 1200 

CTGGAGAACT TGCXICCTOGG CATCTTOGAT CACCTGGGGA AACTOTOTOA OCTOOGGCTG 1260

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCOSC TCOGCAACTQ GCTCCTaCTC 1320 

AACCAGCCTA GGTTAGGQAC 6GACACTGTA CCTGTGTGTT TCAGCCCA6C CAATQTCGGA 1380 

GGCCAGTCCC TCATTATCAT CAATOTCAAC anOCIOTTC CAAGCQTOCA TGTC0CT6AG 1440 

GTGCCTAGTT ACXTCAGAAAC ACCATGGTAC CCAQACACAC CCAGTTACCC TGACACCACA 1500 

TCOGTCTCTT CTACCACTOA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560

ATTCAGQTCA CTGATGACCG CAGCX5TTTGG GGCATGACCC AGGCCCAGAQ OGGGCTGGCC 1620 

ATTGCCGCXa TTGTAATTGO CATTGTOGCC CTGOCCTGCT CCCTOGCTGC CT60GTCGGC 1680 

TGTTGCT6CT GCAAGAAGA6 GAGCCAAGCT GTCCTGATGC AGAT6AAG6C ACXTCAATGAG 1740 

TGTTAAAQAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGQAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC OGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTC5A CTTGCXTPGAT TCTCCOGTAG AGAAGCAGGT 1920 

OGTGCOGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT G6CCATG0CA AAAQOCCTGG 1980 

GGATTTCOSA TTCATACCCC TGOGCTTCCT TOQAGAGGGC TCTTCCTCCA AATCCTCCCC 2040 

AOCTGTCCTC CAAGAACAGC CTTCCCTGOG CCCAGGCCCC CTCCQGGCCT CTOTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTCGTGGO AATAGTTCTC CQCTGAOATA GCCCCTCTCG 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220 

ACXX»GCATG TCCCCTCaAA TOAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTPCTCT 0CTCAAAI3AA GACTTCAAAC CATTTAACTG OTTTCTTAAO AGCX3GTC3UIT 2340 

CAGCCTGGTT TTGGGGAT6C TATGAAAGAG AOAAGGAAAA tCATGCCGCT CAGTTCCTGQ 2400

AQACAQAAGA GCOGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CXaJUVCTGCA AACTTTGCTT 2520 

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAOAAT TTTG6ACTTC TAAAAACATT 2580

AAAATCAOCT TATTAATACG GGATAGAGAA AGAAATCTQG TO0CI6QG60 TCCCT6TGTT 2640 

CACCCCTA6A GTTTGTTTtA AAATTTTTAA TTGAAGCATG TGAAGTOTAC 5T6CA6AAAA 2700 

GTGGGAACAT GATAGTGTAT GGCTTGGTG6 ATTTTCACAA ACTQAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATOXC CATCXITGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGQACTTAC CTGGGAOCTG COXaXATGA GCCAGGAOGQ 2880 

TCOCCGCACA GTCAOCCTGT GCAAAGQCCC G0T6QGCAGQ G6TGGAG6A6 AATATGTGGG 2940 

TGTGGACAGG ATGGGAQACT GTGGCCTGAA CAOQAGATTT TATTAtATCT GGAGACCCTQ 3000 

ACSAQACCCTG A6ACCTGQGG CAOCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAQ 3060 

GTCCGTGCAG CCACACCXTTC TTCCCT6CCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120 

TCOGCCTGGA OCCTTCTATG QAOQTOATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGSSA AlQTGAAATOa CTCAOAOATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240 

GAATCTAOTG TCfTTCTAAT OTOOTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300

TOAACTTCAG AATCTCACTT ACAGCAGGCX5 ACACGGGG6T ACACOGATGG GTCACACTGQ 3360 

GTCTGGGGGC TCCCTGGAGC TCCTCCTGOG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420

TCCAGGGTTA TTCTGCTCCT CGAGTCACAG TCACAOGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACA CATATTCACA TGGOGCTCAA GAAGTTAGGC TCATGGCAAC GTOTGTCTTT 3540 

CTCTGGACAA CTGGCGCAGT TTACAGTGAA AT6QAGAATT TCAGGTCTCC ACGTCTGCCC 3600 

AG6AAAGAAC TTCAOCTGAC TCCACGGGGA TCTGGAAATC CAOGACCAAT CCCGATOGGC 3660

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTrGO AAATCCACCA CCAATCCOQA 3720 

TOGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780 

CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840 

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCOCAGGG AATCTAGGA6 3960 

AGATGAGOCC CGTCAGAQTC AAGAGATGTC ATCCCCCCAO GGTCTCCAAG GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTT6CC AGAGCCAACA GGAAGTGAGC 4080 

CCAGAGCATG GCACATGAGC ATCACCOGCT GATGGTGGCC TGCTGTGCCT QGTQCCAACA 4140 

GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AQACCTGTCQ 4200 

GGTGCTCCTG TGAGTGGCCT CCAGATGTCI TTQTGCATA6 0CACAAGT6G GCCAGGGCTQ 4260 

GAGSSAGGTG OGAAAOCTCA TCATCOGQIG G6CGCIGGCA AlCTTAAOCC A6AAOCCTTA 4320 

GGTATTCCTG GCAGTA6CCA TGAGATTQGA GCACCTTCCT CTCCAGOCAG AGGCTGAOCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGQAGCACCC TAGGTGAGGG GT6AGGGCCC 4440 

CCTTAT6TQA ACCTCTTQCC TCTTCCTTTC TCCCATCAGA GTGGTTG6AT GGAGCCATTG 4500 

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAG6AGC 4560 

TACEAGAAAA GCTGAOTGOA GTCTCCTTTC CAACAGGATG ATOCATTTGC TCAATTCTCA 4620

GG6CTGGAAT GAGCOGGCTG GTCCCCCAGA AAGCTG6AGT GGGGTACAQA 6TTCAOTTTT 4680

CCTCTCTGTT TACAGCTCCT TQACAGTCCC ACGCCCATCT GGAGTQGGAG CTGOGAGTTA 4740 

GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800 

TGCACAGATA CTCTTCAAGC ACTGGAOGTG GATTCTCTCT CTAGGCCTCA GCAGCCCTGC 4860 

G6TAGGAGTG CCGCCTCTAC CCACTTGTGA TGGGGTACAS AGGCACTTQC TCTTCIGCAT 4920 

GGTGTTCAAT AGGCZXjGGAG TTTTATTTAT CTCTTCAAAC TTTOTAGAAO AGCTCATGOC 4980 

TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040 

TTAGTCTTGG TCATCAGAAC CTCACTTQGT ACCATATAQA TCAAAAGCTT TGTAACCACA 5100 

GQAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160 

TGGQCTGTAT GTATATTGTT CTrCC'fCCrf A6AATTTAGA GATACAAGA6 TTCTACTTAG 5220

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340 

AGTTGGTCGA CAGATGTTAQ ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAQATTCA 5400 

GCCGCCAGAT CCCACAGTCA GAACIGAATC TGCGTTGTTG GGAAGCCAGC AGTGGCCTTQ 5460 

GGAAGGAAGC CATGOCTGTG OTTCAGAGAG GGTGGGCTGG CAAGCCACTT COQGGGAAAA 5520 

CTCCTTCOGC CCCAGGTTTC TTCTTCTCTT AA66AGAQAT TGTTCTCACC AACCCGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTOCCTTQCT TAGAGAATTA CTGCAAATCA 5640 

QCCCCAGTGC TTGGCGATGC ATTTACAQAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTG 5700 

AGCCCTGGTG GGCAGGGTTG GGGGGTCTQT CTTCTGCTGG ATGCTOCTTG TAATCCATTT 5760
GGTGTACAC5A ATCAACAATA AATAATATAC ATQTAT

Seq ID NO: 611 Protein sequence
Protein Accession #: BAB84587.1



1 11 21 31 41 51

| | | | | |

MPLKHYLLLL VGC0AH6AGL AYHGCPSBCT CSRASQVECT GARIVAVPTP LPNNAMSIiQI 60
LNTHITELNB SPFLHISAIil ALRIEKMELS RITPGAFRNL GSIjRYLSIiAN NKLQVLPIGL 120
FQGLDSLESL LLSSNQLIiQI QPAHFSQCSN LKSIiQIiHGMH LEYIPDGAPD HLVGLTKLNL 180
GKNSLTHISP RVFQHLOJUJ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 240
FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ IjNRLTLFGNS LKELSLGIFG PMPNLREUfL 300
YDNHISSLPO MVFSNLRQIiQ VLILSRKQIS PISPGAFNGIi TBIiRELSIiHT NALQDLIXaiV 360
FRMLAHLQNI SLONNRIAQL PGtflFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 420
YDNPWRCDSD ILPLRNgiLL NQPRL6TOTV PVCPSPANVR GQSLIIINVN VAVPSVHVPE 480
VPSYPETPWY PDTPSYPDTT SV6STTGLTS PVSDYIDLTT IQVTDDRSVW GMTQAQSGLA 540
lAAIVIGIVA LACSIAACVG CCOGKKRSQA VLHQHKAFNB C



Seq ID NO: 612 DNA sequence
Nucleic Acid Accession #: XM_098151
Coding sequence: 1..447



1 11 21 31 41 51

| | | | | |

ATGATGCATT TGCTCAATTC TCAQGGCTGQ AATGAGCCGG CTGGTCCCCC AGAAAGCTGG 60
AGTGGGGTAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGAGAGT CCCACGCCCA 120
TCTG6AGTGG GAGCTGGQAO TGACSTGTTGO AQAA6AAACA ACAAAAGCCA ATTAGAACCA 180
CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACT66AC GTGGATTCTC 240
TCTCTAGCCC TCAGCACCCC TGOGGTAGGA GTGCCGCCTC TACCCACTTG TGATGGGGTA 300
CAGAGGCACT TGCTCTTCTG CATGGTGTTC AATAGGCTGG GAGTTTTATT TATCTCTTCA 360
AACTTTGTAC AA6AGCTCAT GGCTTGTCTT GGGCTTTCGT CATTAAACCA AAGGAAATGG 420
AA6CCATTCC CCTGTTGCTC TCCTTAG



Seq ID NO: 613 Protein sequence
Protein Accession #: XP_098151



1 11 21 31 41 51

| | | | | |

MMHLLNSQGW NEPAGPPBSW SGVQSSVFIiS VYSSLTVPRP SGVGAGSQCW RRNKKSQLEP 60
LPLKSAYCAQ ILPKHWTWIL SLALSTPAVG VPPLPTCDOV QRHUUPCMVP HRLGVLFISS 120
HFVQBLMACL GLSSLNQRKW KPFPCCSP

Seq ID NO: 614 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372



1 11 21 31 41 51

| | | | | |

GTCCC06CAG C6CCGTCGC6 COCTCCTGCC GCAG6CCACC GAGGCC6CCG CCGTCTAGCG 60
CCCCGACCTC GCCACCAT6A OAGCCCTGCT OGOGCGCCTG CTTCTCTGOQ TCCTGGTCGT 120
GAG06ACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA T0QAACT6TG ACTGTCTAAA 180
TGGAGGAACA TGTGTGTCCA ACAAGTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240
GAAATTCGGA GGGCAGCACT GTGAAATAGA TAAGTCAAAA ACCTGCTATG AGGGGAATGG 300
TCACTTTTAC 0GAGGAAA6G CCAGCACTGA CACCATGGGC OGGCCCTGCC TGCCCTGGAA 360
CTCTGCCACT GTCCTTCAGC AAAGOTACCA TGCCCACAOA TCTGATGCTC 7TCAGCTGGG 420
CCTGGGQAAA CATAATTACT GCAGGAACCC AGACAACCG6 AGGGGACCCT GGTGCTATGT 480
GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCG CAGATGGAAA 540
AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600
CCGCTTTAAG ATTATTGGGG GA6AATTCAC CACCATOGAG AACCAGCCCT GGTTTGCGGC 660
CATCTACAGG AG6CACCGGG GGGGCTCTGT CACCTACGTO TGT6GAG6CA GCCTCATCAG 720
CCCTTGCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCCAAA6A AGGAGGACTA 780
CATOGTCTAC CTGGGTCGCT CAAGGCTTAA CTCCAACACG CAA6GGGAGA TGAA0TTT6A 840
GGTGGAAAAC CrCATCCTAC ACAAG6ACTA CAGCGCTGAC ACGCTTGCTC ACCACM^CGA 900
CATTGCCTTG CT6AAGATCC GTTCCAAGGA GGGCAGGTGT GOGCAGCCAT CCCGGACTAT 960
ACAGACCATC TGCCTGCCCT CGATGTATAA G6ATCCCCAG TTTGGCACAA GCTGTGAGAT 1020
CACTGGCTTT GGAAAAGAGA ATTCTAC06A CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080
TGTTGTGAAG CTOATTTCCC ACCGGGAQTG TCAGCA6CCC CACTACTAC6 GCTCTGAAGT 1140
CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATG6AAA ACAGATTCCT GCCAGGOAGA 1200
CTCAGGGGGA CCCCTCGTCT GTTCCCTCCA AGGGCGCATG ACTTT6ACTG 6AATTGTGAG 1260
CTGGGGCCGT GOATGTGCCC TGAAGGACAA GCCAGGCGTC TACACGAGAG TCTCACACTT 1320
CTTACCCTGG ATC^SCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT QAGGGTCCCC 1380
AGGGAGGAAA CGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAG TAGAGTCATC 1440
TCCATCAGCT GTAAGAAGAG ACTGG6AAGA TAGGCTCTGC ACAGATGGAT TTGCCTGTGG 1500
CACCACCAGG GTGAAC3GACA ATAGCTTTAC CCTCAOGGAT AGGCCTGGGT GCTGGCTGCC 1560
CAQACCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620
TGTCTTTTTC TGGACTGAAG CCTGCAGGAG TTAAAAAGGO CAGGGCATCT CCTGTGCATG 1680
GGCTCGAAGG GA6AGCCAGC TCCCOCGACC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740
AATQAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTQAGG 6AGCTTAGCC 1800
AATGT6GGAG CAQCGGTTTG GGGAGCAGAG ACACTAACGA CTTCAGGGCA GGGCTCTGAT 1860
ATTCCAT6AA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGQ 1920
GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980
AAACTGTGT6 GACTGTGATG CCACACAGAG TGGTCTTTCr GGAGAGGTTA TAGGTCACTC 2040
CTGGGGCCTC TTGGGTCOCC CAOGTGACAG T6CCTGG6AA TGTACTTATr CTGCAGCATG 2100
ACCTGTGACC AGCACTGTCT CAGTTTCACT TTGAGATAOA tGTCCCTTTC TTGGCCWSTT 2160
ATCCCTTCCT TTTAGCCTA6 TTCATCCAAT CCTCACTGGG TGGGGTGAGG ACCACTCCTT 2220
ACACTGAATA TTTATATTTC ACTATTTTTA TTTAIEATTTT TOTAATTTEA AATAAAAGT6 2280
ATCAATAAAA TGTGATTTTT CTGA



Seq ID NO: 615 Protein sequence
Protein Accession #: NP_002649.1



1 11 21 31 41 51

| | | | | |

MRAUARLLL CVLWSDSKO SNELHQVPSN CDCLHGGTCV SNKYFSNIHW QICPKKFGQQ 60
HCBIDKSKTC YBBBBSFYHa KASTD1HGRP GLPflNSATVL QQTYHAHRSO ALQLGLGKHN 120
YCRNFDtniRR PWCTVQVGtiK PLVQECMVED CADGKKPSSP PEELKFQCGQ KTIiRPRFKIX 180
OSEFTTIENQ PHFAAIVKSH RQGSVTYVOG GSLISPCWVI fiATHCPIDYP XKEDYZVYLG 240
RSSUiSMTOQ BQCFGVEHLI LHKDYSADTL AHENDXALLK IRSKEC3RCAQ PSRTIQTICL 300
FSmHOPQEO TSCBITOFGK EHSTDYLYPE QLKMTWKLI SHREOQQPHY YQSEVTTKMI* 360
CAADPQWKTD SCQSD30GPL VCSLQGRMTL TGIVSHQRGC ALKDKPGVYT RVSKFLPWIR 420
SHTKEQIGLA L



Seq ID NO: 616 DNA sequence

Nucleic Acid Accession #: NM_024422.1

Coding sequence: 202..2907



1 11 21 31 41 51

| | | | | |

G6CCAAAGGA AAAGCCCCTT GGATGAGAGG CAGGCGCTTC AGAGAAGCTA AGAAAAGCAC 60
CTCTCCGCGC GCCCCACXTTC CrCCGCCICG C36CTCCTCCT GAGCAGC6GG CCCAGACTCC 120
GCTCCGGCCX5 CQGCCCTCBC CCCGCGGAGC CCTCCTACCC C6GCCCGAC6 CTCX3GCCGGC 180
GACCTGCXXX: GAGCCCTCTC CATGGAGGCA 6CC0GCCCCT COGGCTCCTQ GAAOSGAGCC 240
CTCTOCCGGC TGCTCCTGCT GACCCTCGCG ATCTTAATAT TTGCCAGTGA TGCCTGCAAA 300
AATOTGACAT TACATGTTCC CTCCAAACTA GATGCCGAGA AACTTGTTGG TAGAGTTAAC 360
CTGAAA6A6T GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420
TTGGAGGATG GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTOGGA GAAGAGAAGT 480
TTTACCATAT TACTTTCCAA CACTQAOAAC CAAOAAAAQA AGAAAATATT TGTCTTTTTQ 540
GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGGGOBCC 600
AAGAGAAGAT GGGCTCCAAT TCCTTOTTCG AT6CTAGAAA ACTCCTTGGG TCCTTTTCCA 660
CTTTTCCTTC AACAGGTTCA ATCTGACACG GCCCAAAACT ATACCATATA CTATTCCATA 720
AGAGGTCCTG GAOTTQACCA A6AAGCTCGG AATTTATTTT ATGTGGAGAG AGACACIGGA 780
AACTTGTATT OTACTGQTCC TGTAGATCGT GAGCAGTATG AATCTTTTGA OArCAATTGCC 840
TTTGCAACAA CTCCAQAT66 6TATACTGCA GAACTTCCAC TGCCCCTAAT AATOUUUITA 900
GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT tacaJvttttt 960
GAAAATTGCA GAG7t36GCAC TACTGTGGGA CAAGTGTGTG CTACTGACAA agatgagcct 1020
GACACXSATGC ACACACGCCT GAAGTACTCC ATCATTGGGC AOGTGCCACC atcaccgacx: 1080
CIATTTTCTA TGCATGCAAC TACAGQQGTG ATCACCACAA CATCAICTCA GCTAOACAGA 1140
GAGTTAATTQ ACAAGTACCA GTIGAAAATA AAAGTACAAG ACATG6ATG6 TCAGTATTTT 1200
GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTQATQ ATGTAAATGA CCACTTGCCA 1260
ACATTTACTC GTACTTCTTA TGTGACATCA GTGGAAGAAA ATACAGTTGA TGTGGAAATC 1320
TTACGAGTTA CTGTTGAGGA TAAGGACTTA GTGAATACTG CTAACTGGAG AGCTAATTAT 1380
ACCATTTTAA AGG6CAATGA AAATGGCAAT TTTAAAATT6 TAACAGATGC CAAAACCAAT 1440
GJVAGGAGTTC TTTQTGTAGT TAAGCCTTTG AATTATGAAG AAAA6CAACA GATGATCTTG 1500
CAAATTGGTO TAGTTAATGA AGCTCCATTT TCCAGA6AGG CTAGTCCAAG ATCAGCCATG 1560
AGCACAGCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGCCCTGA GTGTAACCCT 1620
CCAATACAOA CTGTTCGCAT GAAAGAAAAT GCA6AAGTG6 GAACAACAAG CAATGGATAT 1680
AAAGCATATG AOCCAGAAAC AAGAAGTA6C AGTGGCATAA 66TATAAGAA ATTAACTGAT 1740
CCAACA06GT GGGTCACCAT TGATGAAAAT ACAGGATCAA TCAAA6TTTT CAGAAGCCT6 1800
GATAGAGAGG CAGAGACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860
CAA6GAGGGA GAACATGTAC GGGGACACTG GGCATTATAC TTCAAGAC6T GAATGATAAC 1920
AGCCCATTCA TACCTAAAAA GACAGTGATC ATCTGCAAAC CCACCATGTC ATCTGCGGAG 1980
ATTGTT6CGQ TTQATOCTGA TGAGCCTATC CATGGCCCAC CCTTTGACTT TAGTCTGGAG 2040
AGTTCTACTT CAGAAGTACA QAGAATGTGO AGACTGAAAG CAATTAATGA TACAGCAGCA 2100
OQTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACAGT6AGA 2160
GATAGACTTG GCATGTCTAG TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220
6AAAATGACT GCACACATOG TGTAGATCCA AGGATTGGGG GTGGAGGAQT ACAACTTGGA 2280
AAGTGGGOCA TCCTTQCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340
CTGGTCTGTO GGSCTTCTGG GACXSTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400
CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAA6T GTATTCTGCG 2460
AATGGCTTCA CAACCCAAAC TGTGGGCGCT TCTGCTGAGO GA6TTTGTGG CAGCX3TG6GA 2520
TCAGGAATCA AAAACGGAGG TCAGGAGACC ATCGAAATGO TGAAAGGAG6 ACACCAGACC 2580
TOQGAATCCT GCCGGGGGGC TGGCCACCAT CACACCCTGG ACTCCTGCAG GGGAGGACAC 2640
ACX3GAGGTGG ACAACTGCAG ATACACTTAC TCGGAGTGGC ACAGTTTTAC TCAGCCCCGT 2700
CTTGGTGAAA AAGTGTATCT GTGTAATCAA GATGAAAATC ACAAGCATGC CCAAGACTAT 2760
GTCCTGACAT ATAACTATGA AGGAAGAGGA TOGGTGGCIG GGTCT6TAGG TTGTTGCAQT 2820
GAACGACAA6 AA6AAGATGG GCTTGAATTT TTGGATAATT TGGA6CCCAA ATTTAGGACA 2880
CTAGCAGAAG CATGCATGAA GAGATGAGTG TGTTCTAATA AGTCTCTGAA AGCCAGTGGC 2940
TTTATGACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAA6CAGAA6 ATGCTATTTG 3000
TGGGGGTTTT TCTCTCATTA TTTGGATQGA ATCTCTTTGG TCAAATGCAC ATTTACAGAG 3060
AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTATCT 3120
TCTATCCAAG GAGGTCTACA GAGAAATTAA AQTCTGCCTT ATTTGTTACA TTTQGGTATA 3180
ATGACAACAG CCAATTTATA GTGCAATAAA ATGTAATTAA TTCAA6TCCT TATTATAGAC 3240
TATTTGAAGC ACAACCTAAT GGAAAATTGT AGAGACCTTG CTTTAACATT ATCTCCAGTT 3300
AATTAAGTGT TCATGTGGTG CTTG6AAACT 6TTGTTTTCC TGAACATCTA AAGTGTGTAG 3360
ACTGCATTCT T6CTATTATT TTATTCTTOT AATOTGAOCT TTTCACTGTG CAAAQGGAGA 3420
TTTCTAGCCA GGCATTGACT ATTACAATTT GATT



Seq ID NO: 617 Protein sequence
Protein Accession #: NP_077740.1

1 11 21 31 41 51

| | | | | |

MEAARPSGSH NGALCRLLLL TLAILIPASD ACKNVTLHVP SKLDAEKLVG RVNLKECPTA 60
AMLIHSSDPD FQILBDGSVY TTNTILLSSB KRSPTILLSN TENQEKKKIF VFLEHQTKVL 120
kkrhtkekvl rhakrrwapi 

EPRNIiFYVER DTGNLYCTRP 
PIPTEBTYTP TIFBNCRVGT 
TGVITTTSSQ IiDRELIDKVQ 
VTSVEENTVD VBILRVTVED 
KPtNYEEKQQ MILQIGWNE 
KENAEVGTTS NQYKAYDPET 
KKQIYNITVL ASDQGGRTCT 
EPIK6PPFDF SLESSTSEVQ 
VTStDVTLCD CITENDCTHR 
TSKQPKVIPD DLAQQNLIVS 
QETIEMVKGG HQTSBSCRGA 
CHQDENHKIUV QDYVLTYNYS 
R ■ • ■' 



Seq ID NO: 618 DNA sequence

Nucleic Acid Accession #: NM_004949.1

Coding sequence: 202..2745



PCSMLEHSLG 
VDREQYESFE 
TVGQVCATDK 
LKXKVQDMDO 
KDLVNTAiniR 
APFSRBA8PR 
RSSSGIRYKK 
GTLGIILQDV 
RHHRLKAIMD 
VDPRXGGGGV 
NTKAPGDDKV 
GHHHTLDSCR 
(SRGSVAGSVG 



PPPIiFLQQVQ 
IIAFATTPDG 
DBPDTMHTRL 
OYFOIiOTTST 
ANYTILKGNB 

SAMSTA-nrrv 

LTDPTGWVTI 
NSHSFFIPKK 
TAARLSYQHD 
QLGKNAILAI 
YSANGPTTQT 
GGHTEVDKCR 



11 



21 



CGCCAAAGGA 
CTCTCCGOGC 
GCTCCGGG06 
QACCTGCCCC 
CTCTGCCGGC 
AATGTGACAT 
CTGAAAGAGT 
TTGGAGGATG 
TTTACCATAT 
GAGCATCAAA 
AAGAGAAGAT 
CTTTTCCTTC 
A6AGGTCCTG 
AACTTOTATT 
TTTGCAACAA 
GAGGA.TGAAA 
GAAAATTCCA 
GACACGATGC 
CTATTTTCTA 
6AGTTAATTG 
GGTCTAGASA 
ACATTTACTC 
TTACGAGTTA 
ACCATTTTAA 
GAAGGAGTTC 
CAAATTOGTG 
AGCACAGCAA 
CCAATACAGA 
AAAGCATAT6 
CCAACAGGGT 
GATAGAGAGG 
CAAG6AGGGA 
AGCCCATTCA 
ATTGTTQCGQ 
AGTTCTACTT 
OQTCTTTCCT 
GATAGACTT6 
V QAAAATGACT 
AAfiTGGGCCA 
CTG6TCT6TG 
CAGCAGAACC 
AATGGCTTCA 
TCAGGAATCA 
TGGGAATCCT 
ACG6AGGTG0 
CTTGGTGAAG 
QTATCTGTGT 
CTATGAAGGA 
AGATGGGCTT 
CATGAAGAGA 
AAAAAATTAC 
TCATTATTTG 
AGTACACAAA 
TCTACAGAGA 
TTTATAGTGC 
CCTAATGGAA 

GTGGTGcrra 

ATTATTTTAT 
TTQACTATTA 



AAAGCCCCTT 
GCCCCACCTC 
C3SGCCCTCGC 
GAGCCCTCrC 
TGCTCCTGCT 
TACATGTTCC 
GCTTTACAGC 
GTTCAGTCTA 
TACTTTCCAA 
CAAA6GTCCT 
GGGCTCCAAT 
AACAGGTTCA 
GAGTTGACCA 
GTACTCGTCC 
CTCCAGATGG 
AT6ATAACTA 
GAGTGGGCAC 
ACACACGCCT 
TGCATCCAAC 
AC3UVGTACCA 
CAACTTCAAC 
GTACTTCTTA 
CT6TTGAGGA 
AGGQCAATGA 
TTTQTGTAGT 
TAGTTAATGA 
CA6TTACTGT 
CTGTTCGCAT 
ACCCAGAAAC 
GG6TCACCAT 
CAGAGACCAT 
GAACATGTAC 
TACCTAAAAA 
TT6ATCCTGA 
CAGAAGTACA 
ATCAGAATGA 
GCATGTCTAG 
GCACACATOG 
TCSmXSCAAT 
0G6CTTCTGG 
TAATTGTATC 
CAACCCAAAC 
AAAA06GAGG 
GTCGG6GG6C 
ACAACTGCAG 
AATCCATTAG 
AATCAAGATG 
AGAGGATCGG 
GAATTTTTGG 
TGAGTGTGTT 
AAACCAAGAA 
GATGGIVATCT 
TTTTTCAATT 
AATTAAAGTC 
AATAAAATGT 
AATTGTAGAG 

TCTTGTAATG 
CAATTTCATT 



GGATGAGAGG 
CTCCGCCTCX3 
CCCGCGGAGC 
CATGGAGGCA 
GACCCTCX5GG 
CTCCAAACTA 
TGCAAATCTA 
TACAACAAAT 
Ca^CTGAOAAC 
AAAGAAAAGA 
TCCTTGTTC6 
ATCTGACAC6 
AGAACCTCGG 
TGTAGATCGT 
GTATACTCCA 

cccaattttt 

TAGTGTGGGA 
GAAGTACTCC 
TACAGGCX3TG 
GTTGAAAATA 
TTGTATC3^TT 
TGIGACATCA 
TAA6GACTTA 
AAATGGCAAT 
TAAGCCTTTG 
AGCTCCATTT 
TAATGTAGAA 
QAAAGAAAAT 
AAGAAGTAGC 
TGATGAAAAT 
CAAAAATGGC 
GGGGACACTG 
QACAGTGATC 
TGAGCCTATC 
OAGAATGTGG 
TCCTCCATTT 
TGTCACTTCA 
TGTAGATCCA 
ATTGTTGGGC 
GACOTCTAAA 
AAACACAGAA 
TGTGGGCX3CT 
TCAGQAOACC 
TGGCCACCAT 
ATACACTTAC 
AGGACACACr 
AAAATCACAA 
TGGCTGGGTC 
ATAATTTGGA 
CTAATAAGTC 
TTTTTTAAAG 
CTTTQ6TCAA 
TTTACATATT 
TGCCTTATTT 
AATTAATTCA 
ACCTTGCTTT 
TTTTCCTGRA 
TGACCTTTTC 



31 

1 

CA0G08CTTC 
CX3CTCCTCCT 
CCTCCTACCC 
GCCOGCCCCT 
ATCTTAATAT 
GATOCXXMOA 
ATTCATTCAA 
ACTATTCTAT 
CAA6AAAAGA 
CATACTAAA6 
ATGCTAGAAA 
GCCCAAAACT 
AATTTATTTT 
GAQCAGTATG 
GAACTTCCAC 
ACAGAAQAAA 
CAAGTGTGTG 
ATCATTQGGC 
ATCACCACAA 
AAAGTACAAG 
AACATTGATG 
GTGGAAGAAA 
GTGAATACTG 
TTTAAAAniG 
AATTATGAAG 
TCCAGAGAGG 
GATCAGGATG 
GCAGAAGTGG 
AGTGQCATAA 
ACAGQATCAA 
ATATATAATA 
GGCATTATAC 
ATCTGCAAAC 
CATG6CCCAC 
AGACTGAAAG 
GGCTCATATG 
TTGGATGTTA 
AGGATTGGC6 
ATAGCATTGC 
CAACCAAAAO 
GCTCCTGGAG 
TCTGCTCAGG 
ATCGAAATGG 
CACACCCTGG 
TCGGAGTGGC 
CTGATTAAAA 
6C3WTGCCCAA 
TGTAGGTTGT 
GCCCAAATTT 
TCTGAAAGCC 
CAGAAGATGC 
ATGCAGATTT 
TTTAAATTAC 
GTTACATTTG 
AGTCCTTATT 
AAC3VTTATCT 
CATCCAAAGT 
ACIGTQCAAA 



SDTAQNYTIY 
YTPEIiPLPLI 
KYSIIGQVPP 
CZINIDDVKD 
MGNFKIVTDA 
NVEDQDBGPE 
DENTGSIK7F 
TVIICKPTMS 
PPFGSYWPI 
LLQIAIiliPCI 
VGASAQOVCG 
YTYSEMHSPT 
LEFLDNLEPK 



41 



YSIRGPGVDQ 
IKIEDE^a3NY 
SPTLFSMHPT 
HLPTFTRTSY 
KTNEGVLCW 
CNPPIQTVRM 
RSLDRBABTZ 
SAEIVAVDPD 
TVRDRL6MSS 
LPTLVCGASG 
TVGSGIKHGG 
QPRLGEKVYIi 
FRTLAEACMK 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



AGAGAASCTA 

GAGCAGC6G0 
CG6CCCQACG 
CCGGCTCCTG 
TTGCCAGTGA 
AACTTGTTGG 
GTQATCCTGA 
TGTCCTCGGA 
AGAAAATATT 
AAAAAGTTCT 
ACTCCTTGGG 
ATACCATATA 
ATGTGGAGAG 
AATCTTTTGA 
TGCCCCTAAT 
CTTATACTTT 
CTACTGACAA 
AGGTGCCACC 
CATCATCTCA 
ACATOGATGG 
ATGTAAATGA 
ATACAGTTGA 
CTAACTGGAG 
TAACAGATGC 

aaaa6caaca 
ctagtcCaao 
agggccctga 

GAACAACAAG 
QOTATA AflAA 
TCAAAGTTTT 
TTACAGTCCT 
TTCAAGACGT 
CCACCATGTC 
CCTTTGACTT 
CAATTAATGA 
TAGTACXn'AT 
CACTGTGTGA 
GTGGAGGAGT 
TCTTTTGCAT 
TAATTCCTQA 
ATGACAAAGT 
GAGTTTGTGG 
TOAAAGGAGG 
ACTCCTGCAG 
ACAGTTTTAC 
ATTAAACAAT 
QACTATGTCC 
TGCAGTGAAC 
AGGACACTAG 
AQTGGCTTTA 
TATTTGTGGG 
ACAQ RGAQAC 
TTATCTTCTA 
GGTATAAT6A 
ATAGACTATT 
CCAGTTAATT 
GTGTA6ACTG 
GGGA6ATTTC 



51 

1 

AGAAAAOCAC 
CCCAGACTGC 
CTCGGCCCGC 
GAACGGAGCC 
TGCCTGCAAA 
TAGAGTTAAC 
CTTCCAAATT 
GAAGAGAAGT 
TGTCTTTTTG 
AAG6CGCGCC 
TCCTTTTCCA 
CTATTCCATA 
AGACACTGGA 
GATAATTGOC 
AATCAAAATA 
TACAATTTTT 
AGATGAGCCT 
ATCACCCACC 
6CTAGACA0A 
TCAGTATTTT 
CCACTTGCCA 
TGTGGAAATC 
AGCTAATTAT 
CAAAAC C3AT 
GATGATCTTG 
ATCAGCCATG 
GTGTAACCCT 
CAATGGATAT 
ATTAACTGAT 
CAGAAGCCTG 
TGCATCAGAC 
GAATGATAAC 
ATCTGCGGAG 
TAGTCTGGAG 
TACAGCA6C3^ 
AACAGTGAGA 
CTGCATTACC 
ACAACTTGGA 
CCTGTTTACG 
TOATTTAGCC 
GTATTCTGCG 
CACCGTGGGA 
ACACCAGACC 
GGGAGGACAC 
TCAGCCCCGT 
GAAAGAAA6T 
TGACATATAA 
QACAAOAAGA 
CAGAAGCATG 
TGACTTTTAA 
GGTTTTTCTC 
ACTATAAACA 
TCCAAGGAGG 
CAACAGCCAA 
TGAAGCACAA 
AAGTGTTCAT 
CATTCITQCT 
TAGCCAGGCA 



60 
120 

180

240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 



Seq ID NO: 619 Protein sequence
Protein Accession #: NP_004940.1



1 11 21 31 41 51

ilEAARPSGSW NGALCRLLLL TIAILIPASD ACKUVTtHVP SKLDABKLVG HVMLKECFTA 60 
ANLIHSSDPD FQILEDGSVY TTNTILLSSB KRSPTILLSN TEMQBKKKIF VPLEHQTKVL 120

KKRHTKEKVL RRAKRRWAPI PCSMLENSLG PPPLPLQQVQ SDTAQNYTIY YSIRGPGVDQ 180 

BPRNLpyVER DTGNLYCTRP VDREQYESPE IIAPATTPDG YTPELPLPLI IKIEDBNDNY 240 

PIPTBBTYTF TIFENCRVGT TVGQVCATDK DEPDTMRTRL KYSIIGQVPP 8PTLFSMKPT 300 

TGVITTTSSQ LDRELIDKYQ LKIKVQDMDG OYFOUITTST CIINIDDVXID RLPTFTRTSY 360

VTSVEENTVD VEILRVTVED KDLVNTAKWR ANYTILKONE NGNFXIVTDA XTNEOVLCW 420 

KPLNYEEKQQ MILQIGWNE APFSREASPH SAMSTATVTV NVEDQDEGPE CWPPIQTVRM 480 

KBNAEVGTTS NGYKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLDRBAETZ 540 

KNGIYNITVL ASDQGGRTCT GTLGIILQDV NDNSPFZPKK TVIICKPTMS SAEIVAVDPD 600 

BPZH6PFFDF SLBSSTSEVQ RMMRLKAZND TAAJtLSYQND PPFGSYWPI TVRDRLGMSS 660

VTSU>VTLCD CITEMDCTKR VDPRIGOGGV QLGKHAIIAI IiLGIAUiFCI IiFTIfVCGASG 720 

TSKQPKVIPO DLAQQKLXVS NTEAPGDDKV YSANGFTTQT VGASAQ6VG6 TVGSGIKNGG 7B0 

QBTIEMVKGG HQTSBSCR6A GHHHTLDSCR OCmTBVDNCR YTYSEWHSFT QPRLGBBSIR 840 
GHTLIKN 






Seq ID NO: 620 DNA sequence

Nucleic Acid Accession #: NM_032545.1

Coding sequence: 46..718

1 11 21 31 41 51

| | | | | |

AAACTGATCT TCAATGCACT AAGAGAAG6A GACTCTCAAA CCAAAAATGA CCTGQAGGCA 60 

CCATGTCAGG CTTCTGTTTA OGGTCAGTTT GGCATTACAG ATCATCAATT TGGGAAACAG 120 

CTATCAAAGA GAGAAACATA ACGGCGGTAG AGAGGAAGTC ACCAAGGTTG CCACTCAGAA 180

GCACOBACAO TCACCGCTCA ACTOGACCTC CAaXCATTTC GGAOAGGTOA CTGGGAGGGC 240

CGAGGGCTGG GGGCOGGAGG AOCCXSCTCXrC CTACTCCCGG GCTTT06GA6 AGGGTGOSTC 300 

CGCGOGGCCG C6CTGCTGCA OGAAOSGOGG TACCTGOGTG CTGGGCAGCT TCTGCGTGTG 360 

CCCGGCCCAC TTCACaSGCC GCTACTGCGA GCATGACCAG AGGCGCAGTG AATGCGGCGC 420 

CCTGOAGCAC GGAOCCTQGA CCTTCCGCGC CTGCCACCTC TGCAGGTGCA TCTTCGGGGC 480 

CCTGCACTGC CTCCCCCTCC AGAGGCCTGA GOGCTGTGAC CXmAAGACT TCCTGGCXTrC 540

CCACGCTCAC GG6C0GA6GG CCGGGGGC6C GCCCftGCCTG CTACTCTTGC T6CCCTGG6C 600 

ACTCCTGCAC CGCCTOCTGC GCCCGGATGC GCCCGCGCAC CCTCGGTCCC TOGTCCCTTC 660 

OOTOCrOCAG OSGGAGCGGC GCCCCTGCSG AAGGCCGGGA CTTGGGCATC GCCTTTAATT 720 

TTCTATGTTO TAAATAATAG ATGTGTTTAG TTTACCGTAA GCTGAAGCAC TGQOTGAATA 780 

TTTTTATTGO GTAATAAATA TTTTCATOAA AGCX3CCAAAA AAAAAAAAAA AAAAAAAAAA 840






Seq ID NO: 621 Protein sequence
Protein Accession #: NP_115934.1

1 11 21 31 41 51

| | | | | |

MTWRHHVRLL FTVSLALQII NliGNSYQREK HNGGREEVTK VATQKHRQSP LMWTSSHFGB 60 

VTCSAEGHGP EEPLPYSRAF GBGASARPRC CRN6GTCVLG SFCVCPAHPT 6RYCEHDQRR 120 

SEaSALEKGA NTLRACKLCR CIFGAIiBC2iP LQTFDRCDPK DPLASHAHGP SAGOAPSLLL 180
liLPCALLBRL LRFDAPAHPR SIiVPSVLQRE RRPCSRPGU3 HRL 

Seq ID NO: 622 DNA sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..390

1 11 21 31 41 51

| | | | | |

ATGAGGTTCA GTGTCTCAGG CATGAGGACC GACTACCCCA GGAGTGTGCT GGCTCCTGCT 60 

TATGTGTCAG TCTGTCTCCT CCTCTTGTGT CCAAGGGAAG TCATGGCTCC OGCTGGCTCA 120

C5AACCATGGC TGT6CCAGCC GGCACCCAGG TGTGGAGACA AGATCTACAA CCCCTTGGAQ 180 

CAGTGCTGTT ACAATGACGC CATC6TGTCC CTGAGCGAGA CC08CCAATG TGGTCCCCCC 240 

TGCACCTTCT GGCCCTGCTr TGAGCTCTCC TOTCTTGATT CCTTTOQCCr CACAAACGAT 300 

TTTGTTGTGA AGCTGAAGGT TCAGG6T6TG AATTCCCAST GCCACTCATC TOCCATCTCC 360 
AGTAAATGTG AAAGAGGCCG GATATGTTAG

Seq ID NO: 623 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MRFSVSGMRT DYPRSVLAPA YVSVOiLLLC PREVIAFAGS EPWLCQPAFR 06DKIYNPLE 60 
QOCYMDAIVS LSBTRQOGPP CTFWPCFBbC CLDSFGLmS PWKLKVQSV KSQCHSSPIS 120 
SKCERGRIC 



Seq ID NO: 624 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 51..1085

1 11 21 31 41 51

| | | | | |

GGAGCTCAAG CTCCTCTACA AA6AGGTGGA CAGAGAAGAC AGCA6AGACC ATG6GACCCC 60 

CCTCftGCCGC TCCCTGCAGA TT6CATGTCC CCTGGAAGQA QOTCCTGCTC ACAOCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAOCTCAC TATTQAATCC ACGCCATTCA 180 

ATGTGGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG 240

GTTACAQCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAO TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAA6CTACC CCAG6GCCC6 CATACAGTGG TOGAGAGACA ATATACCCCA 360 

ATGCATCCCT GCTGATCCAG AAC6TCAC0C AGAATGACAC AGGATTCTAT ACCCTACAAG 420 

TCATAAAQTC AGATCTT6TG AATGAAGAAO CAACCOGACA 6TTCCATGTA TACCCGGAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTCCOQGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 
GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720

ACOGCAGTOA CCX»6TCACC CTGAATGTOC TCTATGGCCC AGAT6TCCCC ACCATTTCCC 780 

CCTCAAAGOC CAATTACOGT GCAGGGGAAA ATCTGAACCT CTCCTGCCAC GC3M3CCTCTA 840 

ACCCACCTGC ACAGTACTCT TCGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACABTCA OGATGATCAC AGTCTCTGGA AGTGCTCCTG 1020 

TCCTCTCRGC TGTGGOCACC GTOQQCATCA OGATTGGAGT GCTQ6CCAGG GTGGCTCTGA 1080 
TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGQA ACCACTAAAA ACAAGGTCTG 1200 

CTCIGCTCCT GAA6CCCTAT ATGCTGGAGA TGQACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCA6 AQACTTCACC TAACTAQAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380

AACAAOACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGOCAGG ATGATGCTGT CATTAGTATT TCACAA6AAG TAGCTTCAGA 1500 

GGQTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACX5TTTTAC ATAAAATAAO 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTQAACAG 1740 

GGAGGAGTCT GTQCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAQACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATOO CTACRCTCAT 1860 

CTQACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG OGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTG TATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCXX: CAAATGGT6G TAACT6ATAA TAGCACTAAT GCTTTAA6AT TTGGTCACAC 2160 

TCTCACCTAQ GTGAGCGCAT TGA6CC»GT0 GTQCTAAATG CTACATACTC CAACTGAAAT 2220 

OTTAAGGAAG AAGATAQATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAA6A 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCIGA ACTAATCTGA TGTTAAOCAA TOTATTTATT TCTGTGGTTC 2400 

TQTTTCCTTQ TTCCAATTTG ACAAAACCCA CTOTTCTTOT ATTOTATTGC CCAGOQGOAG 2460 

CTATCaCTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAOCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 625 Protein sequence
Protein Accession #: AAA59907.1

1 11 21 31 41 51

| | | | | |

MGPPSAPPCR LHVPHKEVLL TASIiLTFWNP PTTAKLTIES TP7NVABGKE VUjLAHNIjPO 60 

NRIGYSWyKQ ERVDGNSLIV GWXGTQQAT P0PAYS6RET lYPlXASIiLIQ NVTQNDTGPy 120 

TLQVIKSDLV NBBATOQFHV YPEIiPKPSZS SNNSNFVEDK OAVAFTCBPB VGHTTYIMW 180 

NGQSLPVSPR LQLSNGNMTL TLLSVKHNDA GSYECEIQNP ASANRSDPVT LIIVLYGPDVP 240 

TISPSKANYR PGENLNLSCH AASNPPAQYS WFINGTFQQS TQBIiFIPNIT VHNSOSYMOQ 300 
AHNSATGZiNR TTVTNITVSG SAPVLSAVAT VGZTIGVLAR VALI 

Seq ID NO: 626 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 1355..1657

1 11 21 31 41 51

| | | | | |

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGA6AAGAC AGCAGAGACC ATGGGACCCC 60 

CCrCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACT6 CCAAGCTCAC TATTGAATCC AOGCCATTCA 180 

AT0TC6CAGA GGGGAAGOAb QTTCTTCTAC TCGCCCACAA CCTGCCCCAQ AATCGTATTG 240 

GTTACAGCTG GTACAAAOGC OAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTG6 TOGAGAGACA ATATACCCCA 360 

ATGCATCCCT GCT6ATCCAG AACGTCACCC AGAATGACAC AQQATTCTAT ACCCTACAAG 420 

TCATAAAQTC AGATCTTGTG AATQAAGAAQ CAACOGGACA GTTCCATGTA TACCOGGAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTOCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 

GOGTCAAAAG GAAOGATGCA GGATCCTATG AATQTGAAAT ACAGAACCCA GCGASTGCCA 720 

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC A6AT6TCCCC ACCATTTCCC 780 

CCTCAAAGGC CAATTACCGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840

ACCCACCTGC ACAGTACTCT T6GTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCC ATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA GGATOATCAC AGTCTCTGGA ASTGCTCCTO 1020 

TCCTCTCAGC TGTGGCCACC GTCGGCATCA CGATTaOAGT GCTGGCCAGG GTGGCTCTGA 1080 

TATAGCAGCC CTQGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAA6CCCTAT ATGCTGGAGA T6GACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTQAOGTGTG TGCCACTGAO AOACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380

AACAAGAOTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATA AAATA AG 1560 

AGATCCTTTA GTGCACOCAG 7GACTGACAT TAGCAGCATC TTTAACACAG COQTGTOTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CAC C TGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGQTGG TAACTGATAA TAGCACTAAT GCTTTAAQAT TTGGTCACAC 2160 
TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCT6A TGTTAACX»A TGTATTTATT TCTGTGGTTC 2400 

TCTTTCCTTG TTCCAATTTO ACAAAACGCA CTOTTCTIGT ATTGTATTGC OCAGGGGQAO 2460 

CTATCACTGT ACTTGTAGAG TGOTGCTOCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 627 Protein sequence
Protein Accession #: AAA59908.1

1 11 21 31 41 51

| | | | | |

MDSFSQOVKT RLLIMIRIiLP PFMLSUjMPA SFAHQDOAVI SISQEVASEX3 NLTECQIYLV 60 

NPNVLHXIRO PLVKFV7DZS SIFNTAVCSH VQHSFSEIiDF 

Seq ID NO: 628 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 2370..2501

1 11 21 31 41 51

| | | | | |

GGAGCTCAAG CTCCTCTACA AAGAGOTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CXrrCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTO CCAAGCTCAC TATT6AATCC AG6CCATTCA 180 

ATGTCOC5U3A GGGSAAGQAQ GTTCTTCTAC TOOCOCACAA CCTGCCCCAG AATCGTATTO 240 

OTTACAGCTO C3TACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GQATATGTAA 300 

TA6GAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 360 
ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420

TCATAAAGTC AOATCrTGTG AAT6AAGAAG CAACGGGACA GTTCXATGTA TACCOQ^OC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540

CCTTCACCTG TGAACCTGAG GTTCA6AACA CAACCTACCT GTGGTGGGTA AATGOTCAGA 600 

6CCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACrCTC ACTCTACTrCA 660 

GOGTCAAAAG GAAGGATGCA GGATCCTATG AATGTGAAAT ACAGAAOCCA GCGAGT6CCA 720 

ACC6CA6TQA CCCAGTCACC CTGAATQTCC TCTATOOCCC AGATGTCCCC ACCATTTCCC 780 

CCTCAAAGGC GAATTAC06T CCAGGGQAAA ATCTOAACCT CTCCT60CAC GCAGCCTCTA 840 

ACCCACCTGC ACAGTACTCT TGQTTTATCa ATGGGACQTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTCGA AGTGCtCCTQ 1020 

TCXn:CTCAQC TGTGQGCACC GTGOGCATCA OQATTQGAOT 6CTG6CCAGG GTGGCTCTGA 1080 

TATA6C3U3GC CTGGTGTATT TTOGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTOCA ATCCCATTTT ATCCCATGGA ACCACTAAAA AC3UW3GTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATQAAAATT TAAAGGQAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCAG AGACTTCACC TAACTAQAGA CAGTCAAACT 1320 

GCAAAGCATG 6TGAQAAATT GAOjACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGOCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAQAAQ TAGCTTCAGA 1500 

GGGTAACTTA ACA6AGTGTC AGATCTATCT TGTCAATCCC AAOOTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG COGTGTGTTC 1620 

AAATGTACA6 TGGTCCTTTT C3W3AGTTGGA CTTCTAQACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAAT6CCAAA TAATAGAATT GCTCCCTACC AGCT6AACAG 1740 

GGA6GA0TCT GTGCaUSTTTC TGAC3VCTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

QCTQAGACTA AGTTGTAGAA ATTAACAAAT GT6CT6CTTG GTTAAAAI06 CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAG6TG OGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAOAT TT6QTCACAC 2160 

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGCSAGA TTCCAGTCTA CrTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA T6TTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTO TTCCAATTTO ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGOGAG 2460 

CTATCACTGT ACTTGTAGAG TQGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 629 Protein sequence
Protein Accession #: AAA59909.1

1 11 21 31 41 51

| | | | | |

MLTNVFI8W LPPCSNIiTKP TVLVLYCPGG AITVLVEWCC ENS 



Seq ID NO: 630 DNA sequence 

Nucleic Acid Accession #: NM_016639.1

Coding sequence: 40..429

1 11 21 31 41 51

| | | | | |

GCGGCGGGCG CAGACAGCGG CGGGOGCAGG ACGTGCACTA TGGCTCGGGQ CTC6CTGCGC 60 

CGGTTGCTGC GGCTCCTCGT GCTGGGGCTC TGGCTGGCGT TGCTGCGCTC CGTGGCCGGG 120 

GAGCAAGOQC CAGGCACCGC CCCCTGCTCG C6CGGCAGCT CCTGGAGCGC GGACCTGGAC 180 

AAGTGCATGO ACTGCGCGTC TTGCAGGGCQ 06ACCGCACA GCGACTTCTG CCTGGGCTGC 240 

GCTGCAGCAC CTCCTGCCCC CTTCCQGCTG CTTTGGCCCA TCCTTGGGGG CGCTCTGAGC 300 

CTGACCrTOG TGCTGGGGCT GCTTTCTGGC TTTTTGGTCT GGAGACGATG CCGCAGGAGA 360 

GAGAAQTTCA CCACCCCCAT AGAGGA6ACC GGG6GA6A6G GCTGCGCA6C T6TGG0GCTG 420 
ATCCAGTGAC AATGTGCCCC CTGCCAGCCG GGGCTCGCCC ACTCATCATT CATTCATCCA 480
TTCTAGAGCC AGTCTCTGCC TCCCAGACGC GGCGOOAflCC AAOCTCCTCC AACCACAAGG 540
GGGGTGGGGG GCGGTGAATC ACCTCTGAGG CCTGGGCCCA GGOTTCAGGG GAACXTTTCCA 600
AGGTGTCTGG TTGCCCTOCC TCTGGCTCCA GAACAGAAAG GGAGCCTCAC GCTGGCTCAC 660
ACAAAACAGC TGACACTGAC TAAGGAACTG CAGCATTTGC ACAGGGGAGG GGGGTGCCCT 720
CCTTCCTTAG GACCTGGGGG CCAGGCTGAC TTGGGGGGCA GACTTGACaC TAGGCCCCAC 780
TCACTCAGAT GTCCTGAAAT TCCACCftCGG GGGTCACCCT GGGGGGTTAG GGACCTATTT 840
TTAACACTAO GGGCTGGCTC ACTAGGAGGG CTGGCCCTAA GATACAGAOC COCCCAACTC 900
CCCAftAfiCGO, GGAGGAGATA TTTATTTTGG GGASAGTTTO GAGGGGAGGO AGAATTTATT 960
AATAAAAGAA TCTTTAACTT TAAAAAAAAA



Seq ID NO: 631 Protein sequence
Protein Accession #: NP_057723.1

1 11 21 31 41 51

HARGSLRRIiXi RLLVLGLWLA lARSVAQEQA PGTAPCSRGS. SW8ADLDKCM DCA8CRARPH 
SDFCLGCAAA PPAPFRLLWP ILGGALSLTF VLGI,I.SGFI.V WRRCRRRBKP TTPrBETQGE 
GCPAVALIQ 

Seq ID NO: 632 DNA sequence 

Nucleic Acid Accession #: NM_003816.1

Coding sequence: 79..2538



11 



21 



OGGCAGGGTT 
CCT6CGGAAT 
OGGTGGTTGC 
CAACAGACCT 
AQAAGAGAAG 
AAAGAGCATA 
TATACtTACA 
CATTATCX3GG 
GGACTCAGAG 
AGCTCTCATT 
TGtGGAGTTT 
CCCAGCATGA 
GAGCTGTTCA 
GTGAGAGAAG 
ATTOGAATTG 
GGGG6TGCTG 
OGTOGGAGAC 
ATOGCATTTG 
CAAATCACTG 
ATGAATCACG 
GGAGCATCGG 
TTAAATAAAG 
CCCTCCTGTG 
QAATGTQAAT 
TGTGCATATG 
GGAAAAACCA 
CCAGATGTTT 
GGCATGT6CC 
GCCCCCAAAG 
TTCTCK56CA 
TGTGAGAATG 
AGTCGAGGCA 
QGGA7GGTTA 
GTAGATGCTT 
GTATGTAATA 
ACTAAAGGAT 
TTGAGGGACG 
TTTATCTTCA 
ACATA'SOAGT 
CQACATGTTT 
GTACCAACCT 
CCQAAAGTAT 
TATA6TTCCC 
CTAATACTTT 
GAAAACAAAA 
AGTTGTGAAA 
CATCATTGAA 
TGAACATGTT 
AGTGTTTAAG 
AACATGTGAT 
TTTTTCATCA 
CATOAATAAG 
TTATTTTGAA 
TCCATTTTTA 
GAATTTCTAT 
TAAATTATAA 
GGCTATAATA 
CTTGAGAATT 
AGAATGTTTA 
CATAGAAATT 
TTACTGTGGT 



GGAAAATQAT 
CGGCCGAGAT 
TOTTGCTTOG 
CACATCTTTC 
CCCCTAGGCC 
TTATTCACTT 
ACAAOGAAGG 
GCTATGTGGA 
GATTGCTGCA 
TTGAGCACAT 
CCAACAAGGA 
CrCAGCTACT 
TTGTCGTAGA 
AQATGATTCT 
TGCTAGTTGG 
GTGATGTGCT 
ATGACAGTGC 
TGGGAACAGT 
TGGAGACATT 
A76ATGGGAG 
GTTCCAGAAA 
GAGGAAACTG 
GTAATAAGTT 
TGGACCCTTO 
GTGACTGTTG 
GTGAGTGTQA 
TTATTCAGAA 
AGTATTATGA 
ATTGTTTCAT 
ATGAATACAA 
TACAAGAGAT 
CCAAATGTTG 
A0GAA6GCAC 
CTOTTCTGAA 
GCAATAAQAA 
AOGGAGGAAG 
GACTTCTGGT 
TCAAGAGGGA 
CAGATGGCAA 
CTCCAGTGAC 
ATGCAGCCAA 
CATCTCAGGG 
TCACTTGATT 
TTTTTTTTCT 
CACCACAAAA 
TACAAGGAAA 
TAAGTCTTAT 
ATTGCAGTGA 
TGTTATrCTG 
AATCTAATAC 
TGCA06AATT 
CAAATATTGT 
AGTACAAAAT 
TGACCTTTCA 
TATGAATCAT 
GCTTTAAGGT 
AAOCAGGAGC 
TCATGAGCAC 
CATTTACTAA 
AGGCTGGAGA 
ATCTATGAGT 



GGAAGAGGCG 
GGG6TCTGGC 
CCTGGTGGGC 
TTCTTATQAA 
CTATTCAAAA 
GGAAAGGAAC 
GACTTTAATC 
GGGAGTTCAT 
TTTAGAGAAT 
CATTTAT06A 
TATAGA6AAA 
T06AAGAAGA 
CAAGGAAA6G 
CCTGGCAAAC 
ACrGGAGATT 
GGGGAACTTC 
ACAGCTAGTT 
GTGTTCAAGG 
TGGTTGCAIT 
AGATTGTTCC 
CTTTAGCAGT 
CCTTCTTAAT 
GGTGGACGCT 
CT6CGAAGQA 
TAAAGACT6T 
TGTTCCAGAG 
TGGATATCCT 
TGCTCAATGT 
TGAAGTGAAT 
GAAGTGTSCC 
ACCTGTATTT 
GGGTGTGGAT 
AAAATGTGGT 
TTATGACTGT 
TTGTCACTQT 
TGTGGACAGT 
CTTCTTCTTC 



TCAACTGTGG 
AAATCAA6CA 
ACCTCCCAGA 
GCAACCTCAG 
AAACTTAATT 
TTTTTAACCT 
TGATGTTTTC 
CAGACTTCAC 
TGCAGTAAAG 
TCAGTCATCG 
TTCTCAAATT 
AATTTTCTAC 
CTGTGAAAAC 
AATAATCATC 
CTTCAAAAGA 
ATACTAAAAG 
ACTATAGGTA 
GTGAAAGCAT 
ACGAAGTATT 
AATTATAAAA 
TTTAAAATCT 
GGTGTGCTGG 
AAGAAGGAAG 
TATCATCTTA 



31 
I 

GAGGT6GAGG 
GCGCX3CTTTC 
CCAGTCCTCG 
ATTATAACTC 
CAW5TATCTT 
AAAGACCTTT 
ACTGACCATC 
AATTCATCCA 
GGGAGTTATG 
ATGGATGATG 
GAAACTGCAA 
AQAGCTGTCT 
TATGACATGA 
TACITGGATA 
TGGACCAATG 
GTGCAGTGGC 
CTAAAGAAAG 
AGCCACGCAG 
GTT6CTCATG 
TOTGGAGCAA 
TGCAGTGCA6 
ATTCCAAAGC 
GGGGAAGAGT 
AGTACCT3TA 
CXSOTTCCTTC 
TACT6CAATG 
TGCCAGAATA 
CAAGTCATCT 
TCTAAAGGTG 
ACTGGGAATG 
GGAATTGTGC 
TTCCAGCTAG 
GCTGGAAAGA 
GATGTTCAGA 
GAAAATGGCT 
GGACCTACAT 
CTAATTGTTC 
AGAAQCTACT 
AACCCTTCTA 
GAAGTTCCTA 
CAGTT CCCAT 
CCTGCCCGTC 
TCTTTTTGCA 
TTGAAAAGCC 
TAACACAGAA 
CCAGGGAATT 
GTGAGGTTAA 
AACTGTATT6 
CTTAGTTATC 
TGACTAATCA 
ATACTCTAGA 
ATGCACAA6A 
AGTGTGTGTG 
ATAACTCTTA 
GACATTOGTT 
TAATAGATCT 
TCTTCAATCA 
OAACTTTCAA 
GTCATGTAAA 
AAATGGTTTT 
GCTGTGTTAA 



41 



CGACCGAGTG 
CCTCGGGGAC 
GTGCGGCGCG 
CTTGGAGATT 
ATGTTATTCA 
TGCCTGAAGA 
CCAATATACA 
TTGCTCTTAG 
GGAriGAACC 
TCTACAAAGA 
AGGATGAAGA 
TGCCACAGAC 
TGGGAAQAAA 
GXATGTATAT 
6AAACCTGAT 
GGGAAAAGTT 
GTTTTGGTGG 
GCGGGATTAA 
AATTGGGTCA 
AGA6CTGCAT 
AGGACTTTGA 
CTGATGAAGC 
GTGACTGTGG 
AGCTTAAATC 
CAGGAGGTAC 
GTTCTTCTCA 
ACAAAGCCTA 
TTGGCTCAAA 
ACAGATTTGG 
CTTT6TGTGG 
CTGCTATTAT 
6ATCAGAT0T 
TCTGTAGAAA 
AAAAGTGTCA 
GGGCTCGCCC 
ACAATGAAAT 
CCCTTATTGT 
TCAGAAAGAA 
GACAGCCGGG 
TATATGCAAA 
CAAGGCCACC 
CTGCTCCIGC 
AATGTCTTCA 
TTTCTGTTGC 
AAACAGAAAC 
TACAATAACA 
TGCACTAATC 
GTGTAAGATT 
ATTAATGTAG 
GCTGCCAATA 
ATCTTGTCTG 
ACCACAATTA 
TATTCACGCA 
GAGAAATTAA 
CACAATAGCA 
AATCAAATAT 
ATTGAACTTT 
AGCTTGCTAT 
ATATTAGACA 
CTTAAATACC 
AAA'XGAATTT 



60 
120 



51 
I 

CTGAGAGGAA 
CCTTCGTGTC 
GCCAGGCTTT 
AACTAGAGAA 
GGCTGAAGGA 
TTTTGTGGTT 
GAATCATTGT 
CGACTGTTTT 
CCTGCAGAAC 
GCCTCTGAAA 
GGAAGA6CCT 
CCGGTATGTG 
TCAGACTGCT 
TATOTTAAAT 
CAACATAGTT 
TCTTATCACA 
AACTGCAGGA 
TGTGTTTGGA 
TAATCTTGGA 
CATQAATTCA 
GAAGTTAACT 
CTATAGTGCT 
TACTCCAAAG 
ATTTGCTGAG 
TTTATGCOGA 
GTTCTGTCAG 
TTGCTACAAC 
AGCCAAG6CT 
CAATTGTGGT 
AAAGCTTCAG 
TCAAACGCCT 
TCCAGATCCr 
CTTCCAGTOT 
TX3GACATGGG 
AAATTGTGAG 
GAATACTGCA 
CTGTGCTATT 
GAGATCACAA 
GAGTGTTCCT 
CAGATTTGCA 
TCCACCACAA 
ACCTCCTTTA 
GGGAACTGAG 
AACTATGAAT 
TGAGTGTGAG 
TTTCCGTTTC 
ATGGATTTTT 
TTTGTCATTA 
TTCCTCATTG 
ATATCTAATA 
TCACTCACTA 
AGATGTCATA 
GTTACTOGCT 
TTTAATATTA 
CTATTTTAAA 
GTTGATTCAT 
TACAAAACCA 
TAAATCATTT 
CTAATATTTT 
TACAAAAAAG 
TTACTATGGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 




AGATAT6GTA T6GATCGTAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCATAAT 3720 

AAA6TTTAAT AATAGCSTTTA TTAACTGAAT TTCATTAGTT TTTTAAAAGT GmTTGOTT 3780 

TGTGTATATA TACATATACA AATACAACAT TTACAATAAA TAAAATACTT QAAATTCTCA 3840 
AAAAAAAAAA AAAAAAAAAA AAAAA 

Seq ID NO: 633 Protein sequence
Protein Accession #: NP_003807.1

1 11 21 31 41 51

| | | | | |

KGSGARFPS6 TLRVRWUiLL GLVGPVLGAA RPGFQQTSHL 8SYEIITPNR LTRERREAFR 60

WSKQVSYVI QAEGKEHIIH LERNKDLLPE OPWYTYNKE GTLITDHPNI QKHCHYRaYV 120 

BSVHNSSIAL SDCFGLRGLL HLENASYGIB PLQNSSHFEH IIYRMDDVYK EPLKCXSVSNK 180 

OIEKETAKDE ESEPPSMTQL IjRBRKAVLFQ TRYVELFIW DKBRYDMM6R KQTAVREEMI 240 

LLANYUJSMY IMLNIRIVLV GLEIWTNGNL INIVGGAGDV LGNFVQWRER FliITRRRHDS 300 

AQLVLKKGPG GTACaiAFVGT VCSR8HAGGI NVPQQITVET FASIVAHELG HNLGMNHDDG 360 

RDCSCGAKSC IMNSGASGSR NPSSCSAEDF EKLTLHKGGH CLLNIPKPDB AYSAPSCGNK 420 

LVDA3BBCDC GTPKECBLDP CCBGSTCKIjK SPAECAYGDC CKDCRFLPGG TLCRGKTSEC 480 

DVPEYCKQSS QFCQPDVPIQ KGYPCQNNKA YCYNQMC3QYY DAQCQVIPGS KAKAAPMDCP 540 

lEVNSKGDRP GNCGPSGNEY KKCATGNALC GKLQCENVQB ZPVF6IVFAZ IQTPSRGTKC 600 

WGVDFQLGSD VPDPGMVNEG TKCX3AQKICR MFQCVDASVL NYDCDVQXKC HGHGVOfSNK 660 

NCHCENGWAP PNCETKOYGQ SVDSGPTYNE MNTALRDGLL VFPFIiIVPIiI VCAIFIFIKR 720 

DQIiWRSYFRK KRSQTYESDG KNQANPSRQP GSVPRHVSPV TPPREVPIYA NRFAVPTYAA 780 
RQPQQPPSRP PPPQPKVSSQ GKLIPARPAP APPLYSSLT 

Seq ID NO: 634 DNA sequence

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..503

1 11 21 31 41 51

| | | | | |

AGTCTCTGCT CTTCCCAGCC TCTC06GC6C GCTCCAAGGG CTTCCCGTCG GGACCATOOG 60 

CGGCAGTGAG CTCCCGCTGG TCCTGCTGGC GCTGGTCCTC TGCCTAGCX5C CCCGGGGGC6 120 

AG0G6TCC06 CTGCCTGOGG G06QAGGGAC OGTGCTGACC AAGATGTACC CX30GCG6CAA 180 

CCACTOGGGG QTOCSGGCACr TAATGOGGAA AAAiSAGCACA GGGGAGTCTT CTTCTGTTTC 240 

TGAGAGAG06 AGCCT6AAGC AGCA6CTGAG AGA6TACATC AGGTGGGAAG AA6CT6CAAG 300 

GAATTTQCTG GGTCTCATAO AAGCAAAGQA GAACAGAAAC CACCAGCCAC CTCAACCCAA 360 

GQCCTTGGGC AATCAGCAGC CTTOBTGGGA TTCAGAGGAT AGCAGCAACT TCAAAGATGT 420 

AG6TTCAAAA GGOUVAGTTG GTAGACTCTC TGCTCCAGGT TCTCAACGTG AAGGAAGGAA 480 

OCCCCAOCIG AAOCAGCAAT GATAAT6ATG 6CCTCTCTCA AAAGAGAAAA ACAAAAGCCC 540 

TAAGAGACTG AGTTCT6CAA GCATCAGTTC TAC6GATCAT CAACA AGATT TCCTTGT6CA 600 

AAATATTTGA CTATTCTGTA TCTTTCATCC TTQACTAAAT TC3GTGATTTT CAAGCAGCAT 660 

CTTCTGOTTT AAACTTGTTT GCT6TQAACA ATTGTCGAAA AGAGTCTTCC AATTAATGCT 720 

TTTTTATATC TAGGCTACCT GTTGGTTAGA TTCAAGGCCC CGAaCTGTTA CCATTCACAA 780 
TAAAAGCTTA AACACAT 

Seq ID NO: 635 Protein sequence
Protein Accession #: NP_002082.1

1 11 21 31 41 51

| | | | | |

tOlGSEliFLVL LALVLCIiAPR GRAVPLPAGG OTVLTRMYPR GNBHAVGHLM GKKSTGESSS 60 

VSSRGSLKQQ LREYIRWEBA ARNUiGLIEA XBNRNHQPPQ FXALONQQPS WDSEDS8NFR 120 
BVGSKSmSR LSAFGSQREG HNPQLNQQ 

Seq ID NO: 636 DNA sequence

Nucleic Acid Accession #: NM_016522.1

Coding sequence: 365..1299

1 11 21 31 41 51

| | | | | |

GCGGAAGCAG CGAGGAGGGA GCCCCCTTTG GCCGTCCTCC GTGGAACCGG TTTTCOOAGG 60 

.CTGGCAAAAG CX^GAGGCTGG ATTTGGGGGA GGAATATTAG ACTCGGAOOA 0TCTGCGC6C 120 

TTTTCTCCTC CCCQOGCCTC CCGGTCGCCG CGGOTTCACC GCTCAOTCCC CGOGCT06CT 180 

CCGCACCCCA CCCACTTCCT GTGCTOGCCC GGGGGGOOTG TGCCGTGOGG CTGCCGGAGT 240 

TOGGGGAA6T TGTGGCTGTC GAGAATGQGG GTCTGTGGOT ACCTGITCCT GCCCTGGAAG 300 

TGCCTCGTGG TCGTGTCTCT CAGGCTGCTO TTCCTTGTAC OCACAQQAGT 6CCCGTGCGC 360 

A6CG6AGAT6 CCACCTTCCC CAAAGCTAT6 QAGAAG61GA CCSGTCCGGCA GGG6GAGA6C 420 

GCCAOCCTCA GGTGCACTAT T6ACAACCG0 GTCACGCGGG TGGCCTGGCT AAAC06CAGC 480 

ACCATCCTCT ATGCTGGQAA TQACAAOTGG TGCCTGGATC CTCGCGTGGT CCTTCTGAGC 540 

AACACCCAAA CGCAGTACAG CATCGAGATC CAGAAOGTGG ATGTGTATGA 06AGGGCCCT 600 

TACACCXGCT CGGTGCAGAC AGACAACCAC CCAAA6ACCT CTAGGGTCCA CCTCATTGTG 660 

CAAGTATCTC CCAAAATTGT AGAGATTTCT TCAGATATCT CCATTAATQA AGGGAACAAT 720 

ATTAOCCrCA CCTGCATAGC AACTGGTA6A CCA6A6CCTA CGGTTACTTG GAGACACATC 780 

TCTCCCAAAG CGGTTGGCTT TGTGAGT6AA GACGAATACT TGGAAATTCA GGGCATCACC 840 

COGGAACAGT CAQCGGACTA CGAGTGCAGT GCCTCCAATG ACC3TGGCCGC GCCCGTGGTA 900 

CGGAGAGTAA AGGTCACGGT GAACTATGCA CCATACATTT CAGAAGCCAA GGGTACAGGT 960 

CTCCCOOTGG GAGAAAAGGG OACACTGCAO TGTGAAGCCT CAGCAGTCCC CTCAGCAGAA 1020 

TTCCAGTG6T ACAAGGATGA GAAAA6ACTG ATTGAAGGAA AGAAAGGGGT GAAAGTG6AA 1080 

AACAGACCTT TCCTCTCAAA ACTCATCTTC TTCAATGTCT CTGAACATGA CTATGGGAAC 1140 

TACACTTGCG TGGCCTCCAA CAAGCTGGQC CACACCAATG CCAGCATCAT QCTATTTGQT 1200 

CCAGGCGCCG TCAGCGAGOT GAGCAACOGC ACGTCGAGGA GGGCAGGC7G CGTCTGGCTG 1260 

CTGCCTCTTC TG8TCTTGCA CCTGCTTCTC AAATTTTGAT GTGAOTGCCA CTTCCCCACC 1320 

OGGGAAAGGC TGCCGCCACC ACCAGCACCA ACACAACAGC AATGGCAACA CCGACAGCAA 1380 

CCAATCAGAT ATATACAAAT GAAATTAGAA GAAACACAGC CTCATGGGAC AGAAATTTGA 1440 

GGOAGGGGAA CAAAGAATAC TTTGGGGGGA AAAGAGTTTT AAAAAAGAAA TTGAAAATTG 1500 

CCTTQCAGAT ATTTAGGTAC AATGGAGTTT TCTTTTCCCA AACGGQaAGA ACACAGCACA 1560 




ccoaGcrroo acccachKA MScracKica tgcaacsttct ttggtgccag TGrGOGOAG 
SctcSot Sct^coc aqactgcccc cacotggaac attctoqagc K«a«rc^ 

S^S5^TC AGTCCATMA GAQSAACAOA ATGAGACCTI OOKO^ SJ^S 
CCGGCCCARG Ot3TOGCGCTG CGGGCS^CTTT OStMACCOt QCCRCCMaO CGTOXOTTOT 
UfcAACOTOAA ATARAAAOAO CAAAAAAAAA AAAftMMA 

Seq ID NO: 637 Protein sequence
Protein Accession #: NP_057606.1



1620
1680
1740
1800



1 11 21 31 41 51
| | | | | |
NSVGGYLFLP WKCI.WVSI*R LLFLVPTGVP VRSGDATFPK AMDNVTVRQG ESATLRCTID 60
NRVTRVAWLN RSTIIiYAGND KWCLDPRWL LSNTQTQYSI EIQNVDVYDE QPYTCSVQTD 120
NHPKTSRVHL IVQVSPKIVB ISSDISIKBG KSISLTCIAT SRPEPTVTWR HISPKAVGPV 180
SEDEYIiBIQG iTBtSQSGDYB CSASNDVaAP WRRVKVTVN YPPYISSAKG TGVPVGQKOT 240
LQCBASAVF8 AEFQNYKDDK RLIEGKKOVK VBSRPPLSKL IPPNVSBHDY GNYTCVASMK 300
LOTTNASIML PSPGAVSBVS HGTSRRAGCV WLLPLLVIiHL LLKF



Seq ID NO: 636 DNA sequence

Nucleic Acid Accession #: NM_012261.1

Coding sequence: 203..1045



1 11 21 31 41 51
| | | | | |
GATTTGCTCT GCCAGCAOCT GTOGGTGCCG CGCTCGACAC CGAGTCCTAG CTAGGOGCTC 60
ACAGAATACG CGCTCCCTCC CTCCCCCTTC TCTGTCCCCC GCCTCTCGCT CACCCOGGCC 120
CACTCCAGOG GCGACTTTGA GGGATTCCCT CTCTG6CX3GC CTCTGCAOCA GCACAGCOQG 180
CCTCATTCGG GGCACTGCGA GTATGGATCT CCAAGGAAGA 6Q6GTCCCCA GCATCQACAG 240
ACTTCGAGTT CTCCTGATCT TGTTCCATAC AATGGCTCAA ATCATGGCAG AACAAGAAGT 300
GGAAAATCTC TCAGGCCTTT CCACTAACCC TGAAAAAGAT ATATTTGTGG TGCX3GGAAAA 360
TGGGAC6ACG TGTCTCS^TGO CAGAGTTTGC AGCCAAATTT ATTGTACCTT ATGATGTGTG 420
GQOCAGCAAC TAOGTAGATC T6ATCACAGA ACAOOCOSAT ATCSGCATTGA CC06GGGA6C 480
TGAGGTGAAG GGCCGCTGTG G CCACA qCCA GTCGGAGCTG CAAGTGTTCT G6GTGGATCG 540
GQCATAT6CA CTCAAAATGC TCTTTGTAAA GGAAAGCCAC AACATGTCCA AGGGACCTGA 600
GG06ACTTGG AGGCTGAGCA AAGTGCAGTT TGTCTACGAC TCCTCGGAGA ATVACCCACTT 660
C3JA6ACGCA GTCAGTGCTG GGAAGCACAC AGCCAACT06 CACCACCTCT CTQCCTTG6T 720
CACCCCCGCT GGGAAGTCCT ATGAGXGTCA AGCTCAACAA ACCATTTCAC TGGCCTCTAG 780
TGATCOGCAG AAGACGGTCA CCATGATCCT 6TCT6GGGTC CACATCXAAC CXTTTGACAT 840
TATCTCAGAT TTTGTCrTCA GTGAAGAGCA TAAATGCCCA GTGGATGAGC GGGAGCAACT 900
GGAAGAAACC TTGCCCCTGA TTTTGGGGCT CATCTTGGGC CTCXJTCATCA T66TAACACT 960
CGOGATTTAC CACGTCCACC ACAAAATGAC TGCCAACCAG GTGCAGATCC CTCGGGACAG 1020
ATCCCAGTAT AAQCACATGO GCTAQAGGCC OTTAGGCAGG CACCCCCTAT TCCTGCTCCX: 1080
CCAACTGGAT CAGGTAGAAC AACAAAAGCA CTTTTCCATC TTGTACACGA GATACACCAA 1140
CATAGCTACA ATCAAACAG6 C CTGG GTATC TGA6GCTTGC TTGGCTTGTG TCCATGCTTA 1200
AACCCACGGA AGGGGGAGAC TCTTTCGGAT TTGTAGGGTG AAATG6CAAT TATTCTCTCC 1260
ATGCTGGGOA GGAGGGGAGG AGGGTCTCA6 ACAGCTTTCG TGCTCATGGT GGCTTGGCTT 1320
TOACTCTCCA AAOAOCftATA AATGCCACTT GGAGCTGTAT CTGGOCXXAA AGTTTAGGGA 1380
TTGAAAACAT GCTTCTTTGA GGAGGAAACC CCTTTAGGTT CAGAA6AATA TGGGGTQCTT 1440
TGCTCCCTTG GACACAGCTG GCTTATCCTA TACAGTTGTC AATGCACAGA GAATACAACC 1500
TCATGCTCCC TGCAGCAAGA CCCCTGAAAG TGATTCATGC TTCTQGCTGG CATTCIGCAT 1560
6TTTAGTGAT TGTCTTGGGA ATGTTTCACT GCTACCGGCA TCCAG08ACT GCAOCACCAG 1620
AAAAOGACTA ATOTAACTAT GCAflAGTTOT TTGOACTTCT TCGTGTQCCA GGTCCAAOTC 1680
GGQQGACCTQ AAGAATC3AT CTGTOTGAQT CTOTTTTTCA AAATGAAATA AAACACACTA 1740
TTCTCTGGC

Seq ID NO: 639 Protein sequence
Protein Accession #: NP_036393.1

1 11 21 31 41 51

iojLQGRGVPS iDRtRVLLMI. FHTMAQIMAB QBVENLSGL8 WPEH)1FVV ^^^^0^ 60
efaSkpxvpy DVWASNYVDL ITBQADIALT rgabvkgrcg hsqselqvpw vdray^ 120
PVKBSHNMSK GPEATWRLSK VQPVYDSSEK THPKDAVSAG KHTANSOTM J^VTOA^SY 180
BC^OTSL ASSDPQKTVT MILSAVHIQP FDIISDFVFS EBHKCPVDER EQDEEMLI 240
IiGLILGLVIM VTLAIYHVKH KMTAKQVQIP RDRSQYKHM8

Seq ID NO: 640 DNA sequence

Nucleic Acid Accession #: NM_002993.1

Coding sequence: 64..408



1 11 21 31 41 51
| | | | | |
GGCACGAGCC AGTCTCCGCG CCTCCACCCA GCTCAOGAAC C!CG08AACCC TCTCTTGACC 60
ACTATGAGCC TCCCGTCCAG CCX5CGCGGCC CGTOTCCaSQ GTCCTTCGGG CTCCTTGTGC 120
GOGCTGCTCG CGCTGCTGCT CCTGCTGACG C0GCCX3GGGC CCCTCGCCAG CGCTGGTCCT 180
GTCTCTGCTG TGCTGACAGA GCTGCGTTGC ACTTGTTTAC GC6TTACGCT GAGAGTAAAC 240
CCCAAAACGA TTG6TAAACT GCAGGTGTTC CCCQCAQGCC CGCAGTGCTC CAAGGT6GAA 300
GT6GTAGCCT CCXnX3AA6AA 0GQQAA6CAA GTTTGTCTGG ACCCGGAA6C CCCTTTTCTA 360
AAGAAAGTCA TCCAGAAAAT TTTGGACAST GGAAACAAGA AAAACTGAGT AACAAAAAAG 420
ACCATGCATC ATAAAATTGC CCAGTCTTCA GGGGAGCAGT TTTCTGGAGA TCCCTG GACC 480
CAGTAAGAAT AAGAAGGAAG GGTTGGTTTT TTTCCATTTT CTACATGGAT TCCCTACTTT 540
QAAGAGTGTG GGGOAAAGCC TAGGCTTCTC CCTGAAGTTT AC3VGCTCAGC TAATGAAGTA 600
CTAATATAOT ATTTOCACTA TrTACTGTTA TTTTACCTGA TAAGTTATTO AACCCTTTGG 660
CAATTGACCA TATTGTGAGC AAA6AATC3^ TGGTTATTAG TCTTTCAATG AATATT6AAT 720
TGAAGATAAC TATTGTATTT CTATCATACA TTCCTTAAAG TCTTACOGAA AAGGCTGTGG 780
ATTTOGTATG GAAATAATGT TTTATTAGTG TGCTGTTGAG GGAGG3ATCC TGTTGTTCTT 840
ACTCACTCTT CTCATAAAAT AGGAAATATT TTAGTTCTGT TTTCtTGGQG AATATGTTAC 900




TCTTTACCCT AGGATGCTAT TTAASITSIA CXGIATXAGA 
TTWrCIGTGC ASMVTATATT TOCmTTOV OKATTTCTAA 
CrMTATATT CTCTTCCTAT GGrrrrAOAT eTTTOMGTC 
CATGATTTAC TCATTAAACT TTGATTTTQT ATGCTATTTT 
ATTCIGGTCA CTAAATATAC ACTTTAOATA QATGAASAAB 
TBATTQCTAA TTTACATAOA AATGTATTCT CTTGOTTTTI 
AATOAICTOT GCTCTGCAAA eTTTTOAAAA XATAITTOAA 
CATTTAGTCC TCAAAATATA TACAGCATTO CTAAOATTTT 
TTTAAAOOTT TTQACCATIT T6TTATCA6S AATTATACAT 
AAAITCCACT TTTATTTTTT CCrGTGTGTC ATSTTGGTTT 
TGOAOAAACA ATAAAAOATT TCTAAACCAA AAAAAAAAAA 

Seq ID NO: 641 Protein sequence
Protein Accession #: NP_002984.1

21 






ACACTGGGTG 
AAATTTAAOT 
TTCTTAOTAT 
TTCA.CTATAQ 
CCCAAAAACA 
TAAATAAAAO 
CAATTTQAAT 
CAGATATCTA 
GTATCACATT 
TT6GTACTTG 



TGTCATACCG 
TCT8TAAQGG 
66CATAATGT 
GATGACTATA 
GATAAATTCC 
CAAAATTAAC 
ATAAATTCAT 
TTGTGGATCT 
CACTATATTA 
TATTGTCATT 



31 



41 



51 



I 11 

I I I I I I 

HSIiPSSRAAR VPGPSOSIiCA LUUiLIiLLTP FOPLASAGPV 8AVLTELRCT OiRVTtRVNP 
fCTIGKLQVFP A6PQCSKVEV VASLKHGKQV CLDPEAPFLK KVZQKILDSG NKKN 

Seq ID NO: 642 DNA sequence

Nucleic Acid Accession #: NM_013271.1

Coding sequence: 27..809



TCCGGAGCCA 
CCGGGGGCGT 
TCTGCGCGCG 
AGACTQGCGC 
T6CAGGAGCT 
66GC0QAG0C 
TCTGGG6GGC 
CTGCA6C6CA 
QCCAQCTTQT 
ACGACQGCCC 
CCGAGCT6TT 
TGGCAGCCCC 
CTQAGG6C6T 
T6CCT6CAO0 
CAGAAGTGCC 
TTA0CCCG6C 
GATCIGAGC 



11 

1 

GGCTCGCTCG 
CGGCCTTTTG 
GCCX3GTAAA0 
TCCTOGCOGC 
GGCGC36GGCQ 
GCAGGAGGCT 
CCCCCGCAAC 
GCTCXaCTOGC 
CCC06GGCCC 
G6CG66CCCX3 
GAGGTACTTG 
GCGCCGCCTC 
GCTGGGGGGG 
CC360LTCTTG 
CCOGCCATCC 
CAGCCAGGCC 



21 

1 

G6CA6CATGG 
GTGCT6CTGC 
GAACCC060G 
TTCOGGGOGT 
CTGG06CATC 
OAGGATCAGC 
TCTGATCGGG 
GCTCTGCTCC 
GTCCCOGCCQ 
GATGCTGAGG 
CTGGQAC06A 
CGCCGTGCCG 
CTGCTGOGTG 
CCACCCTOA0 
0GCCACGA6G 
TCTCACGOSA 



31 
1 

CGGGGTCGCC 
TGCTCGGCCT 
6CCTAAGG6C 
CAGTGCC003 
TGCTG6AGGC 
AGGOGOGGGT 
CTCTGGGCCT 
G06CC0GCCT 
08Q0GCTC0Q 
AGGCAGGCGA 
TTCTT60GGG 
CCGACCAOGA 
TGAAACGCCT 
CACTQCCCGO 
ACTTCTCCOC 
GQATOCCTAC 



41 

I 

GCTGCTCTGQ 
GTTTCGGCC6 
AG0GTCTCC6 
AGGTGAGG06 
GQAACGTCAG 
0CT6G06CAG 
GGACGACGAC 
TGACCCT6CC 
ACCCCGGCCC 
OGAGACACGC 
AAGC60GGAC 
TGTGGGCTCT 
AGA6A0CC0Q 
ATCCOQTdCA 
GCCAGCACGT 
OCCCTGQCOC 



Seq ID NO: 643 Protein sequence
Protein Accession #: NP_037403.1

1 11 21 31 41 51

| | | | | |

MAGSPLLWOP RAGGVGLLVL LLLGLFRPPP ALCARPVKBP SGLSAASPPL AETQAPRRPR 
R8VPR6BAA6 AVQELARALA HLLBABRQER ASAEAQBABD QQARVLAQItL RVHGAPRNSD 
PALGLDDDPO APAAQLARAL IiRARZiDPAAL AAQLVPAPVP AAALRFRPPV YDDGPAGFDA 
EBACH3ETPDV DPEIiLRYLLG RILAGSADSE GVAAPRRLRR AADBDVQSEL PPEQVLOALL 
RVKRLETPAP OVPARRLLPP 

Seq ID NO: 644 DNA sequence
Nucleic Acid Accession #: NM_002214
Coding sequence: 681..2990



CCCAGAGCC6 
CTGCCGACTT 
GTTGGCCTCC 
TCCCCTG6AC 
TAGG6TGGTT 
CTAAGCTGAT 
TGTCCCX3GAQ 
TGGCCX3TCGA 
6GCCGTAGGG 
COSAGCCGCG 
GGCCCOGAGG 
GGGGCGGGCT 
TCTGCCTGCA 
CACTTGTTCT 
CCTGTGCCAG 
TTTCAGGTGG 
GCTCAGTTGA 
TTAATA CCCA 
ATTTTATGCT 
ATGTCTCAGC 
CTAGAAAAAT 
AAACAOTTTC 
ACAATTTAGA 
TCACTGA6TT 
AAGGAGGTTT 
AAGAGGCTAA 



11 

I 

GTCTTTGCXX: 
CTGCCCACCT 
CTCGCCGGCG 
TCCCCCCCAG 
TTATGCAGCA 
CAGQCTGOGO 
AGGAGGTGCT 
GCCCTGAGAT 
GGGTCOQCCT 
TCGCCCGGGA 
GTTTTGCATT 
AAAOGACCGG 
TGGACTGGGC 
GTGCCTTGCG 
ATCAAGAA6T 
TTCAATAGAA 
GGTGACACCA 
GAAAGTTCAT 
ATCAATGCAC 
GGCATTTTTC 
ACCATACATT 
CTGCATGCCT 
TGAGAAAGCA 
TGACX3CCATG 
AA6ATTGCTG 



21 
I 

TTGCTGGCAT 

GcracTCCGC 

GTGGAAGCAA 
TACCCTCCCA 
CTTCGGGCTT 
GAAGCCCCAC 
AGCCCTTGCA 
TCTCGOGGAG 
GCOGAGCGGT 
GCTAGGCCTG 
GGC0GA6CCC 
ATGTG0G6CT 
OGAGGTCCOO 
CAAGGTGAAG 
CTGGGTCCAO 
GAAGGTTGTG 
TACCCATCTG 
GGAGAAGTGT 
CCTCTGAAGA 
AATAATATAG 
TCCCGTGACT 
AGCATCCACC 
CCCCATGGAT 
GTrCATAGAC 
CTTCA6GCAG 
CTGGTGATQA 



31 
I 

CCOGAGCTTC 
AGACQGGGCT 
CTGCGCTGAT 
CAGATCCAGC 
TGTTTQGGTT 
GGGCTGGAGA 
QAGCCCTCTC 
ACOG06GGAC 
GCCCGGGCCC 
CGGAAAACGT 
GCGTCGGGAA 
GGGCCCT6GC 
CCTCGTTCCT 
ACAATA6ATG 
AATGTGGATG 
ATATTGTTTC 
TGCKFGTTAT 
CTATCCA6CT 
AATATCCTGT 
AAAAATTAAA 
TTOGTCTTGG 
OCGAAAGGAT 
ACATCCATGT 
AGAAGATCTC 
CT6TCTGTGA 
CAGATCAGAC 



41 

I 

CTCCCTTQCC 
GCAAAGCTGC 
TQATG06CCA 
ATCACCCAGT 
TQATTGTGTT 
GAAACAAAAG 
TCCAGTCGCC 
C06COGTGCC 
GCTTACCTGC 

ccta60gaca 
ggcagccagg 
tttttttacc 

CTGG6CAGCC 
TGCATCTTCA 
GTOTGTTCAA 
CAATTTAATA 
AATACOCACT 
GOGTCCAGGA 
GGATCTTTAT 
TTCXX3TTGGA 
ATTTGGCTCA 
TCATA ATCAA 
GCTOTCTTTG 
TGGAAACATA 
AAGTCATATC 
GTCTCATCTC 



51 
I 

AGCCAGGACG 
AACTAAT66T 
CAGACTTTTT 
GAATGTACAT 
TGGCTCTTCG 
CTCTTTTCTT 
6CGGGGCCCT 
GAGCCGGGA6 
ACCGCTTGCT 
CTCGCCCGOG 
CGGOQG6CX3C 
GCTGCATTT3 
TGGGTGTTTT 
AATGCAGCAT 
GAGGATTTCA 
AGCAAAGGCT 
GAAAATGAAA 
GOOGAA<KTA 
TATCTTGTTG 
AACGATTTAT 
TACGTTGATA 
TOCAGTGACT 
ACAGAGAACA 
GATACAGCAG 
GGATGGOGAA 
GCTCTTGATA 



960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



51 
I 

GGGCCGOGGG 
CCCCCCGCGC 
CCCTTGGCT6 
GCGGG66CGG 
GAG06GGGGC 
CTGCTGOGOG 
CCCGACGCX3C 
GCCCTAGCAG 
CCX36TCTAC6 
GAOBTGGACC 
TCCGAGGGGG 
GAGCTGCXXX: 
GOQCCCCAGG 
CCCTOGGAlCC 
OCAGAGCAAC 
ACAATAACAT 



60 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



60 
120 
180 
240 



60 
120 
180
240
300
360
420
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 




GCftAATTGGC AGGCATftOTG GTGCCCAATG ACGGAAACTG TCATCTGAAA AACAACGTCT 1620
AC3QTCAAATC GACAACCATG 6AACACGCCT CACTA66CCA ACTTTCaGAG AAATTAATAG 1680
ACAACAACAT TAATGTCATC TTTGCAGTTC AAG6AAAACA ATTTCATTGG TATAAG6ATC 1740
TTCTACCCCT CTTGCCAGGC ACCATTGCTG GTGAAATAOA ATCAAASGCT GCAAACCTCA 1800
ATAATTTGGT AGTGGAAOCC TATCAGAAGC TCATTTCAGA AGTGAAAGTT CAGGTG6AAA 1860
ACCACGTACA AGGCATCTAT TTTAACATTA CCGCCATCtG TCCAGATGGG TCCAGAAA6C 1920
CAGGCATOGA AG6ATGCAGA AAC6T6ACGA GCAATGATGA AGTTCTTTTC AATGTAACAG 1980
TTACAATGAA AAAATGTGAT GTCACAGGAG GAAAAAACTA T6CAATAATC AAACCTATTG 2040
GTTTTAATOA AAC06CTAAA ATTCATATAC ACR GAAAC TG CAGCTGTCAG TGTGAGGACA 2100
ACAGAGQACC TAAAGGAAAG TGTGTAGATG AAACTTTTCT AGATTCCAAG TGTTTCCAGT 2160
GTGATGAQAA TAAA TGTCA T TTTGATGAAG ATCAGTTTTC TTCTGAGAGT TGCAAGTCAC 2220
ACAAGGATCA GCCTGTTTGC AGTGGTOGAG GABTTTOTOT TTGTGGGAAA TGTTC»TQTC 2280
ACaAAATTAA GCTTGGAAAA GTGTATGGAA AATACCGT6A AAAGGATGAC TTTTCTTGTC 2340
CATATCACCA TGGAAATCTG TQTGCTGGQC ATGGAGAGTG T6AAGCAGGC AGATGCCAAT 2400
GCTTCAGTGG CTGGGAAG6T 0ATCX3ATGCC AGTGCCCTTC AGCAGCAGCC CAGCACTGTG 2460
TCAATTCAAA GQGCCAAGT6 7GCAGTGGAA GAGGCACGTG TGTGTGTGGA AG6TGTGAGT 2520
GCACCGATCC CAGGAGCATC GQCCGCTTCT GTGAACACTG CCCCACCTGT TATACAGCXrr 2580
GCAAGGAAAA CTGGAATTGT ATGCAATGCC TTCACCCTCA CAATTTGTCT CAGGCTATAC 2640
TTQATCAOIG CftAAACCTCA TQTGCTCTCA TGGAACAACA GCATTATGTC GACCAAACTT 2700
CAGAATGTTT CTCCAGCCCA AQCTACTTGA GAATATTTTT CATCATTTTC ATAGTTACAT 2760
TCTTGATTGG GTTGCTTAAA GTCCTGATCA TTACSACAGGT GATACTACAA TGGAATAGTA 2820
ATAAAATTAA GTCCTCATCA GATTACA6AG TGTCAGCCTC AAAAAAG6AT AAGTTQATTC 2880
T6CAAAGTGT TTGCACAAGA GCAGTCACCT ACCGACX3TGA GAAGCCTGAA GAAATAAAAA 2940
TQGATATCAG CAAATTAAAT GCTCATGAAA CTTTCAGGTG CAACTTCTAA AAAAAGArrr 3000
TTAAACACTT AATGGGAAAC TGGAATTGTT AATAATTGCr CCTAAAGATT ATAATTTTAA 3060
AAGTCACAGG AGGAGACAAA TTGCTCACGG TCATGCCA6T TGCTGGTTGT ACACTOGAAC 3120
GAAGACTGAC AAGTATCCTC ATCATQATGT V3ACTC!ACATA GCIGCTGACI TTTTCAGAGA 3180
AAAATGTGTC TTACTACTGT TTGA6ACTAG TGTOGTTGTA GCACTTTACT GTAATATATA 3240
ACTTATTTAO ATCAGCATAG AATGTAGATC CTCTGAAQAG CACTGATTAC ACTTTACAGG 3300
TACCTGTTAT CCXTTACGCTT CCCAGAGAGA ACAATGCTGT GAGAGAGTTT AGCATTGTGT 3360
CACTACAAGG QTACAGTAAT CCCTGCACTO GACATGTGAG 6AAAAAAATA ATCTGGCAAG 3420
TATATTCTAA GQTTGCCSiAA CACTTCAACA GTT6GTG(?IT GAATAGACAA GAACAGCTAG 3480
ATGAATAAAT GATTCGTQTT TCACTCTTTC AAGAGGTGAA CAGATACAAC CTTAATCTTA 3540
AAAGATTATT GCTTTTTAAA GTGTGTAGTT TTATGCATGT GTGTTTATGG TtTGCTTATT 3600
TTTGCAAGAT GGATACTAAT TCCAGCATTC TCTCCXCTTT GCCTTTATQT TTTGTTTTCT 3660
TTTTTACAGG ATAA6TTTAT GTATGTCACA 6ATGACTG6A TTAATTAAGT GCTAAGTTAC 3720
TACTQCCATA AAAAACTAAT AATACAATGT CACTTTATCA GAATACTAGT TTTAAAAGCT 3780
GAATGTTAA



Seq ID NO: 645 Protein sequence
Protein Accession #: NP_002205



MCGSALAPPT 
LGPEOGWCVQ 
GSVSIQUIFG 
SSDFKL6FGS 
VHRQKIS6NI 
VPNDGNCHIiK 
TIAGEIBSKA 
NVTSNDEVLF 
CVDETPUJSK 
VYGKYCEKDD 
CSGRGTCVCG 
CALMBQQHYV 
DYRVSASKKD 



11 

I 

AAFVCLQNDR 
EDFISG6SRS 
AEAHFMLKVH 
YVDKTVSPYI 
DTPEGGFDAM 
NNVYVKSTTM 
AMLNHLWEA 
HVTVTMKKCD 
CFQCDEmCCH 
PSCPYHHGNL 
RCECTDPRSI 
DQTSECPSSP 
KLILQSVCTR 



21 

I 

RGPASFLWAA 
ERCDIVSHLI 
PLKKYFVDLY 



LQAAVCESHI 
EHPSLGQLSE 
YQKLISEVKV 
VTaGKNYAII 
FDEDQFSSES 
CAGHGECEAQ 
GRFCEHCPTC 
SYLRIFPIIP 
AVTYRREKPE 



31 

I 

WVFSLVLGLG 
SKGCSVDSXE 
YI.VDVSASMH 
CSDYNLDOtP 
GWRKEAKRLL 
KlilDMNIUVX 
QVEKQVQGIY 
KPIGFKBTAK 
CXSKKDOPVC 
RGQCPSQWEG 
YTACXENNNC 
IVTPUGIiLK 
EIXMDISKLN 



41 

! 

QGEDNRCASS 
YPSVHVIIPt 
KMIEKIiNSVG 
PHGYIHVLSL 
LVMTDQTSHL 
FAVQGRQFHN 
FNITAICPD6 
IHIHRNCSOQ 
SGRGVCVCGK 
DROQCPSAAA 
MQOiHPHliliS 
VLIIROVIIA 
ABETFRCNF 



51 

! 

NAASCARCXiA 
EKBINTQVTP 
NDLSRKMAFF 
TEUITEPBKA 
ALDSKLAOIV 
YKDLLFUjPG 
SRICPGMBGCR 



CSCHKIKLGK 
QHCVMSKGQV 
QAILDQCKTS 



60 

120 
180
240
300
360
420
480
540 
600 
660 
720 



Seq ID NO: 646 DNA sequence

Nucleic Acid Accession #: NM_003318.1

Coding sequence: 1..2574



1 11 21 31 41 51
| | | | | |
ATGQAATCCO AGGATTTAAG TOGCAGAOAA TTGACAATTG ATTCCATAAT GAACAAAGTG 60
AOAGACATTA AAAATAAGTT TAAAAATGAA QACCTTACTG ATGAACTAAG CTTGAATAAA 120
atttctgctg ATACTACAGA TAACTCGGGA ACT6TTAACC AAATTATGAT GATGGCAAAC 180
aacccagagg ACTGGTTGAG TTTGTTGCTC AAACTAGAGA AAAACA6TGT TCCGCTAAQT 240
QATGCTCTTT TAAATAAATT GATTGGTCX3T TACAGTCAAG CAATTGAAGC GCTTCCCCCA 300
QATAAATATG GCCAAAATGA QAGTTrTGCT AGAATTCAAG TGAGATTTGC TGAATTAAAA 360
GCTATTCAAO A6CCAGATGA TGCACXSTGAC TACTTTCAAA TGGCCAGAGC AAACTGCAAG 420
AAATTTGCTT TTGTTCATAT ATCTTTTGCA CAATTTGAAC T6TCACAAGG TAATGTCAAA 480
AAAAGTAAAC AACTTCTTCA AAAAGCTGTA QAACQT6GA0 ca wBTACC Acr AGAAATGCTG 540
OAAATTGCCC TGGGGAATTT AAACCrCCAA AAAAA6CAGC TGCTTTCAGA GGAGGAAAAG 600
AAGAATTTAT CAGCATCTAC GGTATTAACT GCCCAAGAAT CATTTTCCGG TTCACTTGG6 660
CATTTACAOA ATAGGAACAA CAGTTGTGAT TCCS^GAGGAC AGACTACTAA AGCCAGGTTT 720
TTATATGGAG AGAACATGCC ACCACAAGAT GCA6AAATAG GTTACCGGAA TTCATTGAGA 780
CAAACTAACA AAACTAAACA QTCATGCCCA TTTGGAAOAO TCCCAGTTAA CCTTCTAAAT 840
AGCCCAGATT GTGATGTGAA GACAQATGAT TCAGTT6TAC CTTGTrTTAT GAAAAGACAA 900
ACCTCTAGAT CAGAATGCCG AGATTTGGTT GTGCCTGGAT CTAAACCAAG T6GAAATGAT 960
TCCT6TGAAT TAAGAAATTT AAAGTCTGTT CAAAATAGTC ATTTCAAGGA ACCTCTGGTG 1020
TCAOATGAAA AQA6TTCTGA ACTTATTATT ACTGAITCAA TAACCCTGAA GAA^TAAAAGG 1080
GAATCAAOTC TTCTABCTAA ATTAGAAGAA ACTAAAGAGT ATCAAGAACC AGAGGTTCCA 1140
GAGAGTAACC AOAAACAGTG GCAATCTAAG AGAAAGTCA6 AGTGTATTAA CCAGAATCCT 1200
GCTGCATCTT CAAATCACTG GCAGATTCCG GA6TTAGCX:C GAAAAGTTAA TACAGA6CAG 1260
AAACATACCA CTTTTGAGCA ACCTGTCTTT TCAGTTTCAA AACAGTCACC ACCAATATCA 1320
ACATCTAAAT GGTTTGACCC AAAATCTATT TGTAAGACAC CAA6CAGCAA TACCTTGGAT 1380




GATTACATGA GCTQTTTTAG AACTCCAGTT GTAAAGAATG ACTTTCCACC TGCTTGTCAG 
TTGTCAACftC CTTAIGGCOl ACCTQCCIGT TTCCA6CAGC AilCAGCATCA AATACTT6GC 
ACTOCACTTC AAAATTTACA GGTTXTAGCA TCTTCTTCaG CAAATGAATG CATTTCGGTT 
AAAGGAAGAJl TTTATTCCAT TTTAAA6CA6 ATAGGAAGTG GAGGTTCAAG CAAGGTATTT 
CAGGTGTTAA ATGAAAAGAA ACAGATATAT GCTATAAAAT ATGTGAACTT AGAAGAAGCA 
QATAACCAAA CTCTTGATAG TTACC3GGAAC GAAATA6CTT ATTTQAATAA ACTACAACAA 
CSVCAOTGATA AQATCATCCG ACTTTATGAT TATGAAATCA CGGACCAGTA CATCTACATG 
GTAATGOAGT GTGGAAATAT TGATCTTAAT AGTTG&CTTA AAAAGAAAAA ATCCATTGAT 
CCATGGQAAC GCAAQAGTTA CTGGAAAAAT ATGTTAGAGG CAGTTCACAC AATCCATCAA 
CATQGCATTG TTCACAGTGA TCTTAAACCA GCTAACTTTC TGATAOTTGA TGGAATGCTA 
AAGCTAATTG AnTTCGGAT TSCAAACCAA ATGCAACCAG ATACAACAAG TGTTGTTAAA 
GATTCTCAGG TTGGCACAGT TAATTATATG CCACCAGAAG CAATCAAAGA TATGTCTTCC 
TCCAGAGAGA ATGGGAAATC TAAGTCAAAG ATAA6CCCCA AAAGTGATGT TTGGTCCTTA 
GGATGTATTT TGTACTATAT GACTTACGGG AAAACACCAT TTCA6CAGAT AATTAATCAG 
ATTTCTAAAT TACATGCCAT AATTGATCCT AATCAT6AAA TT6AATTTCC CGATATTCCA 
GAGAAAGATC TTCAAGATGT GTTAAAGTGT TGTTTAAAAA GGGACCCAAA ACAGAGGATA 
TCCAITCCTG AGCTCCTGGC TCATCCCTAT 6TTCAAATTC AAACTCATCC A6TTAACC3A 
ATGGCCAAGG GAACCACTGA AGAAATGAAA TATGTTCTOG GCCAACTTGT TGGTCTOAAT 
TCTCCTAACT CCATTTTGAA AGCTGCTAAA ACTTTATATG AACACTATAG TGGTGGTGAA 
AGTCATAATT CTTCATGCrC CAAOACTTTT OAAAAAAAAA GGGOAAAAAA XKA 

Seq ID NO: 647 Protein sequence
Protein Accession #: NP_003309.1






NPSDHLSXjLL 
AIQEFDDAKD 
EIALRNZiNXiQ 
LYGENMFPQD 
TSRSECRDLV 
ESSLLAKLEE 
KBTTFEQPVF 
LSTPYGQPAC 
QVIiNEKKQiy 
VMBCGNIDIiH 
KLIDFGIANQ 
GCILYYMTYG 
SIPELLABPY 
SHHSSSSKTF 



11 
I 

LTIDSXMNKV 
KLEKNSVFLS 
YFQMARANCaC 
KKQIiLSEEEK 
AEIGYRNSIiR 
VPGSKPSGND 
*rKEYQEPEVP 
SVSKQSPPXS 
PQQQQHQILA 
AIKYVNLEEA 
SWLKKKKSID 
MQPDTTSWK 
KTPFQQIZNQ 
VQIQTHPVNQ 



21 
I 

RDIKKKFKNE 
OAIiUTKLIGR 
KFAFVHISPA 
XNLSASTVLT 
QTNKTKQSCP 
SCELHNLKSV 
ESNQKQHQSK 
TSKHFDPRSZ 
TPLQKLQVLA 
DNQTIjDSYHK 
PWERKSYWKN 
DSQVGTVNYM 
ISKLEAZIDP 
HAK3TTEEKK 



31 

1 

DLTDELSLNK 
YSQAIBALPP 
QFELSQGHVK 



41 

1 

ISADTTDNSG 
DKYGQNESFA 
KSKQLLQKAV 



FGRVFVNI.Uf 
QNSHFKEPLV 
RKSECINQNP 
CKTPSS13TU3 
SSSAHECISV 
EIAYLUKLOO 
MIiEAVRTIHQ 
PPEAIKDMSS 
NHEIEFPOIP 
YVLGQLVQLK 



SPDCOVKTDD 



AASSNBWQIP 
DYMSCPRTPV 
KGRIYSXLKQ 
H8DKIIRLYD 
EGIVHSDLKP 
SRQ7GKSKSK 
EKDLQOVLKC 
SPNSILKAAK 



51 
I 

TVNQIMMfWN 
RIQVRFAEajC 
ERGAVPLEMb 
SRGQraCARP 
SWPCFMKRQ 
TDSITIiKKKT 
ELARICVIirrEQ 
VKNDFPPACQ 



Seq ID NO: 648 DNA sequence
Nucleic Acid Accession #: NM_015507
Coding sequence: 241..1902



CCQCAGAGGA 
C6GGCCTG0C 
06AGTGGA6C 
GGGTCCGGCC 
ATGCCTCTGC 
GGGAACGCGG 
TGTCACTAT6 
TGTGAAGCTA 
AGATGCTTTC 
AAACCCCGGC 
CTCAGTGGCC 
ATAAACTGTC 
TCAGGACTCC 
GGTAAAGTCA 
AAATGTCACA 
AATGAATGTA 
GGGTCCTTCA 
ATCCCTQAAA 
AAGAAGTTGC 
CCAGAACCCA 
ATAGnrCCA 
GAGGGGCTTG 
AGCCTGOGAG 
CTGGTCCAAA 
GACTGCAGCT 
TGGAATCCTG 
GGTCACAAGA 
AACTTCTGTT 
TTTGTGAAAA 
TQ6AAGACA6 
GAAGCAGAAC 
TCAG6CTTAT 
TTGACrTTGT 
TTAQAATTAC 
TCTT8TATAA 
TTTCTGAATC 
CAGTATATCT 
TAGAAAAAAA 
TATGACATCA 



11 

! 

GCCTCGGCCA 
GCG G T G CCTG 
G6AGGACC0G 
GGCGCCCTCC 
CCTGGAGCCT 
CCAGTGCAAG 
GAACTAAACT 
CATGGGAACC 
CAGGATACAC 
CATGCCAACA 
ACATGCTCAT 
AGTACAGCTG 
OCCTGGCCOC 
TCTGTCCCTA 
TTGGTTTCX3A 
CTATGGATAG 
AGTGTAAATG 
ATTCTGTGAA 
TT6CTCACAA 
CCAG6ACTCC 
GAGGCGG6AA 
AGGATGAGAA 
GAGATGTGTT 
GQAAAGCGCT 
TCAA.TCATGG 
CTGATGC3AGA 
AAGACATTGG 
TGCTCTTTGA 
ACAGTAACAA 
GGAAAATTCA 
GTGGCftAGGG 
GTCCAGATAG 
ATGTCAGTTC 
TAGCTGAAAA 
QATATGCCAA 
TTTCCACATT 
GATTTGTATA 
AGCACA6AGA 
AAGATAGACT 



21 

I 

G6CTA6CCAG 
GCCTCCCCTC 
AGCGGCTGAG 
OGAGGGGGGC 
TGCGCTCCCG 
GCATCACGGG 
GGCCTGCTGC 
TGGATGTAAG 
CG66AAAACC 
CAGATGTGTG 
GCCAGATGCT 
T6AAQACACA 
AAATGGAAGA 
CAATCGAAGA 
ACT6CAATAT 
CCATACGTGC 
CAAGCAGGGA 
GGAAGTCCTC 
AAACAGCATG 
TACCCCTAAG 
CTCTCATQGA 
AAGAGAAGAG 
TTTCCCTAAG 
AAGTTCCAAA 
GATCTGTOAC 
TAATGCTATT 
CCGATTGAAA 
TTACCX3GCTG 
TGCCCT6QCA 
QTTGTATCAA 
CAAAACGG6C 
CCTTTTATCT 
CCTGGTTTTT 
ATT6TAATGT 
TATTTGCTTT 
ATATTATAAA 
AQTAAGTTGA 
AATGTTTAAC 
TTTGCCTAAG 



31 
I 

GGCGCCCCCA 
CCftGACTGCA 
GA6AGJU3GAO 
TCAGGAGGAG 
CTGCTGCTCT 
TTGTTAGCAT 
TAOQGCTGGA 
TTTGGTQAGT 
TGCAGTCAAG 
AATACACAOG 
AOGTGTGTGA 
GAAGAAGGGC 
GACTGTCTAG 
TGTGTGAACA 
ATCAGTGGAC 
AGCCACCATG 
TATAAAGGCA 
AGAGCACCTG 
AAAAAOAAGG 
CRTGAACTTOC 
GCTAAAAAAG 
AAAGCCCTGA 
GTGAATGAAQ 
CTGGAACATA 
TGGAAACAGG 
GGCTTCTATA 
CTTCTCCTAC 
GCCGGAGACA 
TGGGAGAAGA 
GGAACTOATO 
GAAATCGCAO 
GTGGATGACT 
TTGATATTGC 
ACCAACASAA 
AAATATCATA 
ATATGGAAAT 
TGAGCTTCTC 
TGTTTGACTC 
TGGCTTAGCT 




CGGCACGTCA 
GAA6AAACAG 
GCGTQGGACC 
ATGTQAATGA 
GAAGCTACAA 
ACTCTAGGAC 
CACAGTGCCT 
ATATTGATGA 
CATTTGGAAG 
6ATATGACTG 
CCAATTGCTT 
ATGGACTTCG 
GTACCATCAA 
CAAAAATTAA 
AOCCCTTCAA 
GGAATGAAGA 
AGAATGACAT 
CAGGTGAATT 
AAGATTTAAA 
ATAGAGAAGA 
TGGCAGTTCC 
CT6ACCTGCA 
AAGTCGGGAA 
CCAOGAGTGA 
CTACCAAAAO 
TGGATG6CGT 
GAATGTTACT 
ATCATAGGAC 
ATATTATTGT 
TCftCTGTATC 
GTCAGTTTAT 
TCTACAACAT 
TTATGATACT 
GGGTCTTTCA 



YBITDQyiYM 
AKFLIVDGML 
ISPKSDVWSIi 
OiKROPKQRI 
TI.yEHySGQB 



51 

I 

AGGCCGCGAG 
CCGGTAACTG 
A6CTGCTACG 
CCGTGCGAGA 
A6GTGGTTTC 
GCCTGGGGTC 
CAAGG6AGTC 
AAACAAATGC 
GTGTGQAATG 
GTGCTTTTGC 
ATCTGCCATG 
GTGTCCATCC 
ATGTGCCTCT 
CTACTACTGC 
TATAGATA1A 
CAATACCCAA 
GT6TTCTGCT 
AQACAGAATC 
AAAT6TTACC 
CTATGAAGAO 
6AAAAT6AAA 
A6AGGA6CX3A 
CGGCCTGATT 
TATCTCGGTT 
TGATTTTGAC 
GGCCTTGGCA 
ACXXXAAAGC 
ACTTCQAGTG 
GGATGAAAAO 
CATCATTTTT 
CTTGCTTGTT 
ATCTTTATAT 
CTCTGGCATT 
AAGATGCCTT 
TTCTCAGTCA 
CTCCCCTCCr 
TTCTAGAAAA 
TCTTGGAAAC 
TAGCCAAACT 



1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 




TGTATATTTA AATTCTTTGT AATAATAATA TCCAAATCAT CAAAAAAAAA AAAAAAAA 



Seq ID NO: 649 Protein sequence
Protein Accession #: NP_056322

1 11 21 31 41 51

| | | | | |

MPLPWSLALP LLLSWVAGGF GNAASARHHG LLASARQPGV anrGTKLACX: YGWRRNSKGV 60 

CBATCEPGCK FGECVGENKC RCPPGYTGKT CSQDVNECGM KPRPCQHRCV NTHGSYKCFC 120 

LSGHMLNPDA TCVNSRTCAM INOQYSCBDT EE6PQCLCPS SGLRLAPNGR DCLDIDECAS 180 

GKVICPYNRR CVNTFOSYYC KCHIGFELQy ISGRYDCIDI NECTMDSHTC SHHANCFNTQ 240 

GSFKCKCKQG YKGNGLRCSA IPEHSVKEVL RAPGTIKDRI KKLLAHKNSM KKKAKIKNVT 300 

PEPTRTPTPK VKLQPFfTYEE IVSRGGNSHG GKKGNEEKMK BGLEDEXRBB KALKNDIBBR 360 

SLRGDVPFPK VNEAGEFGLI LVQRKALTSK LBHKDLNISV DCSPNH6ICD WKQDREDDFD 420 

NKPADRONAI GFYMAVPAUV GHKKDIGRLK LLLPDLQPQS KFCLLFDYRL AGDKVGKLRV 480 

FVKNSMNALA HBKTTSEDEK WKIXSKIQIiYQ GTDATKSZIF BAERGKGK9X3 EZAVDGVLIiV 540 
SGLCPDSLLS VDD 

Seq ID NO: 650 DNA sequence

Nucleic Acid Accession #: NM_003506.1

Coding sequence: 259..2379

1 11 21 31 41 51 

GCAGCTCCAG TCCCGGACX3C AACCCCGGAG CCGTCTCA6G TCCCTGGGGG GAAOGGTGGG 60 

TTAGACGGGG ACXX3GAAGGG ACAGCGGCCT TCGACOGCCC CCC6AGTAAT TGACCCAGGA 120 

CTCATTTTCA GGAAAGCCTG AAAATGAGTA AAATAGTGAA ATGAGGAATT TGAACATTTT 180 

ATCTTTGGAT GGGGATCTTC TGAGGATGCA AAGA6TGATT CATCCAAGCC ATGTGGTAAA 240 

ATGAGGAATT TGAAGAAAAT GOAaATOTTT ACATTTTTGT TSACBTGTAT TTTTCTACCC 300

CTCCTAAGAG GGCACA6TCT CTTCACCTGT 6AACCAATTA CTGTTCCCAG ATGTATGAAA 360 

ATGGCCTACA ACATGACGTT TTTCCCTAAT CTGATGGGTC ATTATGACCA GAGTATTGCC 420 

GCGGTGGAAA TGGA6CATTT TCTTCCTCTC GCSVAATCTGQ AATGTTCACC AAACATTGAA 480 

ACTTTCCTCT GCAAAGCATT TGTACCAACC TGCATAGAAC AAATTCATGT GGTTCCACCT 540 

TCTOerAAAC TTTGTQAGAA AGTATATTCT QATTQCAAAA AATTAATTGA CACTTTTGGG 600 

ATCOGATGGC CTGAGGAGCT TQAATOTGAC AOATTACAAT ACTGTGATGA 6ACTQTTCCT 660 

GTAACTTTTG ATCCACACAC AGAATTTCTT GGTOCTCAGA AGAAAACAGA ACAAGTCCAA 720 

AGAGACATTG GATTTTGGTG TCX»AGGCAT CTTAA6ACTT CTGGGGGACA AGGATATAAG 780 

TTTCTGGGAA TTGAOCAGTG TGCGCCTCCA TGCGCCAACA TGTATTTTAA AASTGATGAG 840

CTAGAQTTTG CAAAAAGTTT TATTGOAACA GTTTCAATAT TTTGTCTTTG TGCAACTCTO 900 

TTCACATTCC TTACTTTTTT AATTGATGTT AGAAGATTCA GATACCCAGA GAGACCAATT 960 

ATATATTACT CTGTCTGTTA CAGCATTGTA TCTCTTATGT ACTTCATTGG ATTTTTOCTO 1020 

QGCGATAGCA CAGCCTGCAA TAAGGCAGAT GAGAAGCTAQ AACTTGGTGA CACTGTTGTC 1080 

CTAGGCTCTC AAAATAAGGC TTGCACCGTT TTGTTCATGC TTTTGTATTT TTTCACAATO 1140 

GCTGGCACTG TOTOGTOGGT GATTCTTACC ATTACTTGGT TCTTAQCTOC AGGAAQAAAA 1200 

TOGAGTTGTG AAGCCAT06A GCAAAAAGCA GTGTGGTTTC ATGCT6TT0C ATGG6GAACA 1260 

CCAGGTTTCC TGACTGTTAT GCTTCTTGCT CTGAACAAAG TTGAAGGAGA C3iACATTAGT 1320 

GGAGTTTGCT TTGTTGGCCT TTATGACCTQ GATGCTTCTC GCTACTTTGT ACTCTTGCCA 1380 

CTGTGCCTTT GTGTGTTTGT TGGGCTCTCT CnrrCTTTTAG CTGGCATTAT TTCCTTAAAT 1440 

CATGTTOGAC AAOTCATACA ACATGATGGC CGGAACCMG AAAAACTAAA GAAATTTATG 1500 

ATT0GAATT6 GAGTCTTCAG C3GGCTTGTAT CTTGTGCCAT TAGTGACACT TCTOGGATGT 1560 

TACGTCTATG AGCAAGTGAA CAGGATTACC TGGGAGATAA CTTGGGTCTC TOATCATTGT 1620 

CGTCAGTACC ATATCCCAT6 TCCTTATCA6 GCAAAAGCAA AAGCTCGACC AGAATTGGCT 1680 

TTATTTATGA TAAAATACCT 6ATGACATTA ATTGTTGGCA TCTCTGCTGT CTTCTQGGTT 1740 

GQAAGCAAAA AOACATOCAC AGAATOGOCT GGGTTTTTTA AACGAAATC6 CAAGAGAGAT 1800 

CCAATC3kOTG AAAGTCGAAG AGTACTACRG 6AATCATGTG AGTTTTTCTT AAAQCACAAT 1860

TCTAAAGTTA AACACAAAAA GAAGCACTAT AAACXaAGTT CACACAAGCT GAAQGTCATT 1920 

TCCAAATCCA TGGGAACCAG CACAGGAGCT ACAGCAAATC ATGGCaCTTC TGCAGTAGCA 1980 

ATTACTAGCC ATGATTACCT AGGACAAGAA ACTTTGACAO AAATCCAAAC CTCACCAGAA 2040 

ACATCAATGA GAGAGGTGAA AGCGGACGGA GCTAGCACCX: CCAGGTTAAG AGAACAGGAC 2100 

T8TG0TGAAC CTGCCTOQCC AGCAGCATCC ATCTCCAGAC TCTCTGGGGA ACABGTCGAC 2160 

GGGAAGQGCC AQGCAGGCAG TGTATCTGAA AGTGCX3CGGA GTGAAGGAAG GATTAGTCCA 2220 

AAGAGTGATA TTACTGACAC TGGCCTGGCA CAGA6CAACA ATTTGCAGGT CCCCAOTTCT 2280 

TCAGAACCAA GCAGCCTCAA AGGTTCCACA TCTCTGCTT6 TTCACCCAGT TTCAGGA6TG 2340 

AGAAAAGAGC AGGGAGGTGG TTGTCATTCA GATACTTGAA QAACATTTTC TCTCXSTTACT 2400 

CAGAAQCAAA TTTGTGTTAC ACTGGAAGTG ACCTATGCAC TGTTTTGTAA GAATCACTGT 2460 

TACGTTCTTC TTTTGCACTT AAAGTTGCAT TGCCTACTGT TATACTGGAA AAAATAGAGT 2520 

TCAAGAATAA TATGACTCAT 7TCACACAAA GSTTAATQAC AAGAATATAC CTGAAAACAO 2580 

AAATGTGCAG GTTAATAATA TTTTTTTAAT AGT0TGG6AG GACAGA8TTA GAGQAATCTT 2640 

CCTTTTCTAT TTATGAAGAT TCTACTCTTG GTAAGAGTAT TTTAAGATGT ACTATGCTAT 2700 

TTTACCTTTT TGATATAAAA TCAAGATATT TCTTTQCTGA AGTAT TTAAA TCTTATCCTT 2760 

GTATCTTTTT ATAC3VTATTT 6AAAATAAGC TTATATGTAT TT6AACTTTT TTGAAATCCT 2820 

ATTCAAGTAT TTTPATC31TG CTATTGTOAT ATTTTAGCAC TTTQQTA6CT TTTACftCTQA 2880 

ATTTCTAAGA AAATTGTAAA ATAGTCTTCT TTTATACTGT AAAAAAAGAT ATACCAAAAA 2940 

GTCTTATAAT AGGAATTTAA CTTTAAAAAC CCACTTATTO ATACCTTACC ATCTAAAATG 3000 

TGTGATTTTT ATAGTCTCGT TTTAGGAATT TCACAGATCT AAATTATGTA ACTGAAATAA 3060 

GGTGCTTACT CAAAQAGTGT CCACTATTGA TTGTATTATG CTGCTCACTG ATCCTTCTGC 3120 

ATATTTAAAA TAAAATGTCC TAAAGGGTTA GTA6ACAAAA TGTTAGTCTT TTGTATATTA 3180 

G6CCAAGTGC AATTGACTTC CCTTTTTTAA TGTTTCATGA OOVCCXaTTG ATTGTATTAT 3240 

AAOCACTTAC AGTTGCTTAT ATTTTTTOTT TTAA CTTTTO TTTCTTAACA TTTAQAATAT 3300 
TACATTTT6T ATTATACA6T AOCTTTCTCA GACATTTTST JU} 

Seq ID NO: 651 Protein sequence
Protein Accession #: NP_003497.1

1 11 21 31 41 51

| | | | | |
MEKFTPLLTC IFItPLLRGBS LFTCBPITVP RCMKP4AYNMT FPPNLMGHYD OSIAAVEMEH 60

FLPLANLECS PNIETPLCKA FVPTCIEQIH WPPCRKLCE KVYSDCKKLI OTFGIRWPBE 120 

LECDRLQYCD ETVPVTPDPH TEPLOPQKKT EQVQRDIGPW CPRHIiKTSGG QGYKFLGIDQ 180 

CAPPCPNMVP KSDELBPAKS FIGTVSIFCL CATLFTPLTP LIDVRRFRYP ERPIIVYSVC 240 

YSIVSLMYFI OPLLGDSTAC NKADEKLELG DTWLGSQNK ACTVLPMIjLY PFTMAGTVWW 300 

VILTITWFLA AGRKWSCEAI EQKAVWFHAV AWOTPGFLTV MLUU^IKVEG 0NI86VCPV6 360 

hYDWhSRXP VLLPI/CZiCVP VGLSLLLAGI ISLNHVRQVI QHDGSHQEKh KKFMXRI6VF 420 

SGLYLVPLVT LIiGCYVYEQV NRITWEITWV SDHCRQYHIP CPYQAKAKAR PELALFMIKY 480 

IjMTLIVGISA VFWVGSKKTC TEWAGFFKRN RKRDPISESR RVLQESCEPP LKHNSKVKHK 540 

KKHYKPSSHK LKVISKSMGT STGATANHGT SAVAITSHDY LGQETLTEIQ TSPETSMREV 600 

KADGASTPRL REQDC6EFAS PAASISRLSG BQVDGKGQA6 SVSBSARSB6 RISPKSOITD 660 
TOLAQSNNLQ VPSSSEPS8L KOSTSLLVHP VSOVRREQQG GCBSDT 

Seq ID NO: 652 DNA sequence

Nucleic Acid Accession #: NM_014791.1

Coding sequence: 171..2126

1 11 21 31 41 51

| | | | | |

TTGGCGG6CG GAAGCGGCCA CAACCCGGCG ATOGAAAAGA TTCTTAGGAA CX3C0GTACCA 60 

GCCGCGTCTC TCAGQACAGC AGGCCCCTGT CCTTCTGTCG GGCXSCOGCTC AlGCOGTQCCC 120 

TCCGCCCCTC AGGTTCTTTT TCTAATTCCA AATAAACTTG CAAGAGGACT ATGAAAGATT 180 

ATGATGAACT TCTCAAATAT TATGAATTAC AT6AAACTAT TQQGACAGGT GQCTTTGCAA 240 

AGGTCAAACT TGCCT6CCAT ATCCTTACTG GAQAGATGGT AGCTATAAAA ATCATGGATA 300 

AAAACACACT AG6GAGTGAT TTGCCCCGGA TGAAAAC3QGA GATTGAGGCC TTGAAGAACC 360 

T6AGACATCA GCATATATGT CAACTCTACC ATGTGCTAGA GACAGCCAAC AAAATATTCA 420 

TGGTTCTTGA GTACTGCCCT GGAGGAGAGC TGTTT6ACTA TATAATTTCC CAGGATCGCC 480 

TGTCAQAAGA GQAGACCCGG GTTGTCTTCC GTCAGATAGT ATCrGCrGTT GCTTATGTGC 540 

ACAGCCAGGG CTATGCTCAC AGGGACXTTCA AGCCA6AAAA T n xaCTGV n' GAT6AATATC 600 

ATAAATTAAA GCTGATTGAC TTTQGTCTCr 6TGC3UUMCC CAAGGOTAAC AAGOATTAOC 660 

ATCTACAGAC AT6CTGTG6G A6TCTG6CTT ATGCAGCACC TGA6TTAATA CAAGGCAAAT 720 

CATATCTTGG ATCAGAGGCA GATGTTTGGA GCATGGGCyiT ACTGTTATAT GTTCTTATGT 780 

GTGGArrrCT ACCATTTGAT GATGATAATG TAATGGCTTT ATACAAGAAG ATTATGAGAG 840 

GAAAATATGA TGTTCCCAAQ TGGCTCTCTC CCAiSTAGCAT TCTGCTTCTT CAACAAATGC 900 

TGCAOOTGOA CCCAAAGAAA CGGATTTCTA TGAAAAATCT ATTGAACCAT OCCTGGATCA 960 

TCCAAGATTA CAACTATCCT OTTQAOTGQC AAAGCRAGAA TCCTTTTATT CACCTCGATG 1020 

ATGATTQCX3T AACAGAACTT TCTGTACATC ACAGAAACAA CAG6CAAACA ATGGAGGATT 1080 

TAATTTCACT GTGGCAGTAT GATCACCTCA CGGCTACCTA TCTTCTGCTT CTAGCCAAGA 1140 

AGGCTCOGGG AAAACCAGTT 0GTTTAAG6C TTTCTTCTTT CTCCTGTGGA CAAGCCAGTG 1200 

CTACCCCATT CACAGACATC AAGTCAAATA ATTGGA6TCT GOAAOATGTO ACOQCAAOTG 1260 

ATAAAAATTA TGT66CGGGA TTAATAGACT ATGATrGGTG TGAAGATGAT TTATCAACAG 1320 

GTGCTGCTAC TCXTCGAACA TCACAGTTTA CCAAGTACTO 6ACAGAATCA AATGGGGTGG 1380 

AATCTAAATC ATTAACTCCA GCCTTATGCA GAACAOCTGC AAATAAATTA AAGAACAAAG 1440 

AAAATGTATA TACTCCTAAG TCTGCTGTAA A6AATGAAGA GTACTTTATG TTT0CTGA6C 1500 

CAAAGACrOC AGTTAATAAG AACX»GCATA AGAGAGAAAT ACTCACIAGO CCAAATOGTT 1560 

ACACTACACC CTCAAAA6CT AOAAACCAOT GGCTOAAAOA AACTCCAATT AAAATACCAC 1620 

TAAATTCAAC AGGAACAGAC AAOTTAATGA C3U3GTGTCAT TA6CCCT6AG AGGOGGTGCC 1680 

GCTCAGTOOA ATTQGATCTC AACCAAGCAC ATATGGAGGA GACTCCAAAA AGAAA6GGAG 1740 

CCAAAGTGTT TGG6AGCCTT GAAA6G0GGT TG6ATAA6GT TATCACTGTG CTCACCAGGA 1800 

GGAAAAQGAA G6GTTCT6CC AGAGAGGGGC CCAGAAGACT AAAGCTTCAC TATAATGTQA 1860 

CTACAACTA6 ATTAOIGAAT CCAGATCAAC T6TTGAATGA AATAAT6TCT ATTCTTCCAA 1920 

AGAAGCATGT TGACTTTGTA CAAAAGG6TT ATACACTGAA GTGTCAAACA CAGTCAGATT 1980 

TTGGGAAAGT GACAATGCAA TTTGAATTAG AA6TGTGCCA GCTTCAAAAA CCOGATGTGG 2040 

TGGG7ATCAG 6AGOCAGC6G CTTAAGG60Q ATGCCTGGGT TTACAAAAGA TTAGTG6AAG 2100 

ACATCCIATC TAGCTGCAAQ GTATAATTQA TGGATTCTTC CATCCTGC08 GAT6A6T6TG 2160 

GGT6TQATAC AGCCTACATA AAGACTOTTA TQATCGCTTT GATTTTAAAG TTCATTGGAA 2220 

CTACCAACTT GTTTCTAAAO AGCTATCTTA AGACCAATAT CTCTTTGTTT TTAAACAAAA 2280 

GATATTATTT TGTGTATGAA TCTAAATCAA GCCCATCTGT CATTATGTTA CTGTCTTTTT 2340 

TAATCATGTG GTTTTGTATA TTAATAATTG TTGACTTTCT TAGATTCACT TOCATATGTG 2400 

AATGTAAGCT CTTAACTATG TCTCTTTGTA ATGTGTAATT TCTTTCTGAA ATAAAACCAT 2460 
TT6T6AATAT 

Seq ID NO: 653 Protein sequence
Protein Accession #: NP_055606.1

1 11 21 31 41 51

| | | | | |

MXDYDELLKV YSLaETlGTG GFAKVXLACH ILTGEMVAIK IMOraTOGSD LFRIXTEIEA 60 

LKNLRHQHIC QLYHVLETAN KXFMVLBYCP GGELFDYIIS QDRIiSEBBTR WFRQIVSAV 120 

AYVH8QGYAH RDIiKPENLLF DEYHKLKLID FGLCAKPKGN KDYHLQTCCG SLAYAAPELI 180 

QGKSYLGSEA DVWSMGILLY VLMGGFLPFD DDNVMALYKK ZMRGKYDVPK WLSPSSILLL 240 

QQMIiQVDPKK RISMKNIiliNH PHIMQDYNYP VSWQSKMPFZ HLDDDCVTEL SVHHRNNRQT 300 

MEDLISLWQY DHLTATYLLL IiAKKARGKFV RLRLSSFSCG QASATPFTDZ XSNNHSLEDV 360 

TAfiOKNYVAG IiIDYDWCEDO ZiSTGAATPRT SQFTRYHTES KGVESKSLTP ALCRTPANKL 420 

KNKBNVYTPK SAVKNEEYPM FPEPKTPVNK NQHKRBILTT PMRYTTPSKA RNQCLKETPI 480 

KIPVNSTGTD KLMTGVISPE RRCRSV^L NQAHMEETPK RKGAKVFGSL ERGLDKVITV 540 

LTR5RRXGSA RDGPRRLKLH YWTTTRIfVN PDQLLNEIMS ILPKKHVDFV QXGYTLKCQT 600 
QSDFQKVTNQ FBLBVCQLQK PDWGIRRQR LKGDANVYKR LVEDILSSCK V 

Seq ID NO: 654 DNA sequence
Nucleic Acid Accession #: NM_000582
Coding sequence: 88..990

1 11 21 31 41 51

| | | | | |

GCAGAGCACA QCATOGTCGG GACCAGACTC GTCTCAGGCC AGTTGCAGCC TTCTCAGCCA 60 

AACGCCX3ACC AAGGAAAACT CACTACCATG AQAATTGCAG TGATTTGCTT TT6CCTCCTA 120 
GGC/ITCACCT GT6CCATACC AGTTAAACA6 GCTGATTCTG
CTTTACAACA AMACCCAGA TGCT6TGGCC ACftlGGCTAA 
CAGAATCTCC TAGCCOCACA GACCCTTOCA AOTAAOTCCA 
GATGATATGG ATGATGAAGA TGATGATGAC CATGTGGACA 
AACGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC 
TCTGATGAAT CTGATGAACT G6TC3VCTGAT TTTCCCA06G 
TTCACTCCAG TTGTCCCCAC AGTAGACACA TATGATGGCC 
GGACTGAGGT CAAAATCTAA GAAGTTTOGC A6ACCTGACA 
GACGAGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA 
CCCGTTGCCC AGQACCTGAA OGCOCCTTCT GATTGGGACA 
GAAACGAGTC AGCTGGATGA CCAGAGTGCT GAAACCCACA 
TATAAGGGGA AAGCCAATGA TGAGAGCAAT GAGCATTCCG 
CTTTCCAAAG TCAGCCGTGA ATTCCACAGC CATGAATTTC 
GTTGTA6ACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA 
TTA6ATAGTG CATCTTCTGA GGTCAATTAA AAGGAGAAAA 
ATTTAGTCAA AAGAAAAAAT GCTTTATAGC AAAATGAAAG 
CTCAGTTTAr TGGTTGAATG TGTATCTATT TOAQTCTGGA 
ATTAGTTTA6 TTTGTGGCTT CATGGAAACT CCCTGTAAAC 
CTATGTTCAT TCTATAGAAG AAATGCAAAC TATCAC TGTA 
TCATGAATAG AAATTTATGT AGAA6CAAAC AAAATACTTT 
ATAACATTTT ATGTCACTAT AATCTTTT6T TTTTTAAGTT 
TATCTTTTTG TGGTQTOAAT AAATCTTTTA TCTTGAATGT 
AATTGCTTAT TTGTTTTCCC ACGGTTGTCC AGCRATTAAT 
GCCTAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 655 Protein sequence
Protein Accession #: NP_000573

1 11 






GAAGTTCTGA 
ACCCTGACCC 
AGQAAAGCCA 
GCCAGGACrC 
AGTCIGATQA 
ACCTGCCAGC 
GAGGTGATAG 
TCCAGTACCC 
ATGGTGCATA 
GCCGTGGGAA 
GCCACAAGCA 
ATGTGATTQA 
ACA6CCATGA 
AATTTG6TAT 
AATACAATTT 
AGAACATGAA 
AATAACTAAT 
TAAAAGCTTC 
TTTTAATATT 
TACCCACTTA 
AGTGTATATT 
AATAAGAATT 
AAAACATAAC 



GGAAAAGCAG 
ATCTCAGAAG 
TGACCACATG 
CATTQACTCG 
GTCTCACCAT 
AACCGAAGTT 
TGTGGTTTAT 
TGATGCTACA 
CAA6GCCATC 
GGACAGTTAT 
GTCCAGATTA 
TAGTCAGGAA 
AGATATGCTG 
TTCTCATGAA 
CTCACTTTGC 
ATGCJTTCTTT 
GTGTTTGATA 
AGGGTTATGT 
TOTTATTCTC 
AAAAGAGAAT 
TTGTTGTGAT 
TGGTGGTGTC 
CTTTTTTACT 



51 



21 31 41 

I I I I 

MRIAVICFCL LGITCAIPVK QADSGSSBBK QLYNKYPDAV ATHLNPOPSQ KQNItLAPQTL 
PSKSNESHDH MDDMDDEDDD DHVDSQDSID SNDSDDVDDT DDSHQSDESH HSDBSDELVT 
DFPTDLPATE VFTPWPTVD TYDGRGDSW YGLRSKSKKF RRPDIQYPDA TDEDITSHME 
SEBLMGAYKA IPVAQDLNAP SDHDSRGKDS YETSQLDDQS AETHSEKQSR LYKRKANDES 
NBEX8DVZDSQ BLSKVSREFB SHBFHSHEDH LWDPKSKBE DKELKFRZSH ELDSA88BVH 

Seq ID NO: 656 DNA sequence

Nucleic Acid Accession #: NM_003108.1

Coding sequence: 76..1401



1 
I 

GGGGTGG6AG 
GCCCTGCAAC 
OGGGAGGCGC 
GA6A6CGACC 
TTCATGGTAT 
AACGCCX3AGA 
ATCCCGTTCA 
TACAAOTACC 
CAGAGCCCA6 
GGTGCCAAGA 
GCGGGC6CCA 
GAGGACTAOG 
AOSGTCAAGT 
CAGCTGCA6A 
CTGCAGCCGC 
CCCGCCAGCC 
GAGGTGCGGG 
AAGAACATCA 
TOGCGCTCGG 
GAGGACGCCG 
GCCA6CGAGC 
GATAAGGATT 
TACTGCACGC 
GACCT6GTGT 
AGCTGGGTTC 
ATGATGGTGG 
ATATTGATAA 
TTAAAGTGAA 
TCCTTTATOO 
AAAATGTGTT 
GAGG6G6G66 
GTCQGTCTTT 
TCTAGGGAGT 
TTTTTAACAA 



11 
I 

GGG6AGGGGG 
GQATCATOGT 
TGGACACGGA 

CA6ACTGGTG 
GGTCCAAGAT 
TCTCCAAGAG 
TCOGGQAGGC 
GOCCCCXaGAA 
AGAAGAGCGC 
CCTCCAAGGG 
AG6G6GGCGC 
TGCTGG6CAG 
GaSTGTTTCT 
TCAAACAGGA 
CGGGGCAGCA 
CTAC3GCTGAG 
CCGGCGOOAC 
CCAAGCAGCA 
TGTCCACCTC 
ACGACCTGAT 
AGCAGCTGGG 
TGGATTCGTT 
CGGAGCTGAG 
TCACATATTG 
CTTOGGAGaA 
TGTTGATGGT 
GATOTCGTGA 
ATGAGTAGTT 
TGTCTCAAGG 
TTTGTAATTA 
CX3C66CGGAG 
GAAGTCTGGA 
TGGTGGAGAT 
AAAAAGGG 



21 

I 

ACCTCOGCAC 
GCA6CAGGCQ 
QGAGGGCGAA 
CAAGACGGOS 
CGAACGCAGG 
GCTGGGCAAG 
GGAGGG6CTG 
AAAGGCCAAA 
GGCCGGOSGC 
CTCCAGCAAG 
GGGCAA6GCG 
GCTGOGCGTG 
GGATGAGGAC 
GCCGGACGAG 
GCCGTC6CAG 
CAGCTCGGCG 
CTCX3GGCGCC 
GCCGCCGCCG 
CTGGTCCAGC 
QTTOGACCTG 
GGGC6GCX3GG 
CAGCGAGGGC 
CGAGATGATC 
AAAG606CCC 
AGTTGTAGTG 
GGOGGTGGTA 
CGCAAAGAAA 
TTTAAACATT 
TAGTTGCATA 
CTATTTCTTT 
GGGA6QTAGG 
AGAOGTCTGC 
ATTTTTTTTT 



31 

I 

GAGACCCAGC 
GA8AGCTTG0 
TTCATGGCTT 
TCGGGCCACA 
AAGATCATGG 
G6CTGGAAAA 
GGGCTCAAGC 
ATGGACCCCT 
GGC3GGCX3GGA 
AAAT6066CA 
GCCCAGTCCG 
AGCGGCTGGO 
GA0GAOGA06 
GAGGACGAGG 
CT6CTGAGAC 
GAGTCCCCCG 
GGGGGCGGCA 
CTCGCGCAGC 
AG CA60 GGCA 
AGCTTQAATT 
GCGGCOQGOA 
AGCCTGGGCT 
GOSGGGGACT 
GCTGCTCGCT 
GTOATGATGA 
GGGTGGAGG6 
TTG6AAAACA 
TTTCCTGTCC 
CCTAGTCTGG 
TTCCTGAAAT 
ACCG6CTC0G 
AGAGGACCCT 
CTTAAGA6AA 



41 

I 

GGCCCGGOTT 
AAaOQGAOAG 
GCAOCCCGGT 
TCAAGCX36CC 
AGCAGTCrCC 
TGCTGAAGGA 
ACATGGCCGA 
GGGCXIAAGCC 
GGGCGGG06G 
AGCTCAAGGC 
GGGACTACGG 
GCX3G0G0CGG 
ACGAGGACGA 
AACCACOGCA 
GCTACAAOGT 
AGGGAGCX3AG 
GCCGCCTCTA 
C0GCGCT6TC 
GCAGCAGCGG 
TCTCTCAAAG 
ACCTGTCCCT 
CCCACTTC6A 
GGCTGGAGGC 
CTTTCTCTCG 
TGAXGATGAT 
GAGA6AAGAA 
T6ATGAAAAT 
TTTTTTTGTC 
AGTTGTGATT 
TCGTGATTQC 
QAAQGCGCTQ 
TTTGGCAOCA 
CTTAAAGAAC 



TTTGAAGCTT 
CAACTGTTAC 
TGGTGATTTT 



180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
1440 
1500 



51 

I 

GGAGGGTCCA 

CAACCTGCCC 

GGCCCTGGAC 

GATGAACGCG 

GGACAT6CAC 

CAGCX3AGAAG 

CTACCCC6AC 

CAGCGCCAGC 

AGGCXKX3GGC 

CCCOGGGGCC 

GGGCGCGGGC 

OGCGGGCAAG 

CXSACGAGCTG 

CCAGCAGCTC 

OGGCAAAGTO 

CCTCTACGAC 

CTACAGCTTC 

GCCCGCGTCC 

CAGCAGCGGC 

0GGGCACA6C 

6TG6CTGGTG 

GTTCCCCGAC 

GAACTTCTCC 

GAGGGT6CAG 

AATGATGATG 

GATGCTGATG . 

TTTGGTGGA6 

CCCCCTCCCT 

ATTTTCCC3A 



60 

120 
180 
240 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 



Seq ID NO: 657 Protein sequence
Protein Accession #: NP_003099.1

1 11 21 31 41 51

| | | | | |
MVQQAESLEA ESNLPREALD TEEGEFKACS PVALDBSDPD HCKTASGHIK RPMNAFMVWS 
KIERRKIMEQ 8PDMHNABIS KRLGKRHKMIj KDSEKZPFIR EAERLRLKHM ADYPDYKYRP 
RKKPKMDPSA KPSASQSPBR SAAGGGGGSA GGGAGGAKTS KGSSKKCGKL KAPAAAGAKA 
GAGKAAQSGD YGGAGDDYVL GSIiRVSGSGG GGAGKTVKCV FLDEDDDDDD DDDELQLQIR 



60 
120 
180 
240 




QBPDBEDBBP FHQQIiIiQPra QQPSQLUUtY HVAKVPASFT LSSSAKSPEQ ASIiYDEVSAG 
ATSQM3068R LVySFRNITK QHPMLAQPA bSPASSKSVS TBSSSSSOSS SOSSOBmDD 
LHFDLSLHF8 Q8ARSASEQQ LGOOAAAGini SLSLVDXDtO S P 880 S MS H PBFF0yCnB 
LSEMIAGDML EANFSDbVTT Y 

Seq ID NO: 658 DNA sequence
Nucleic Acid Accession #: NM_001719
Coding sequence: 133..141S






300 
360 
420 



G6GGQCAGCG 
CTQCCACCTG 

OGATGCAOGT 
CCCTGTTCCT 
GCTTCATCCA 
CCATTTTGGG 

ccATxyrrcAT 

GCCAQGGCTT 
GCCT0CAA6A 
TGOAAOVTGA 
TTTCCAAGAT 
ACATCCGGGA 
AGCACTTGGG 
AGGAGGGCTG 
GGCACAACCT 
AGTTGQCG6G 
TCTTCWW5QC 
GCCM3AACCG 
AGAACAGCAO 
6AGACCTGG6 
AGGGGGAGTG 
AGACGCTGGT 
AGCTCAATGC 
ACAGAAACAT 
TTGGG6CCAA 
CTGCCTTTTG 
AAACATGAGC 
TCCTACAAGC 
GOCGGGCCAG 
TTATGAGCGC 
GG6CACATT6 
CAATAAAA06 



11 

I 

GGQCCCOTCT 
GGGCGGTGOQ 
GCGCTCACTO 
GCTGCGCTCC 

ccggcgcctc 
cttgccccac 
gctggacx:tg 
ctoctacccc 

TAGCCATTTC 
CAAGGAATTC 
CCCAGAAGGG 
ACX3CTTCGAC 
CAGGGAATCG 
GCTGGTGTTT 
GGGCCTGCAG 
CXrTGATTGGG 
CACGOAGGTC 
CTCCAAGACX3 
CAGCGACCAO 
CTGGCAGGAC 
TGCCTTCCCT 
CCACTTCATC 
CATCTCOGTC 
GGTGGTGCGO 
GTTTTTCTQO 
7GAGACCTTC 
AGCATAT6GC 
TGT6CAGGCA 
GTCATTOGCT 
CTAOCAGCCA 
GTGTCTGT6C 
AAT6AAT6 



21 
I 

GCAGCAAGTG 
GGGC06GAGC 
CGAGCTGGGG 
GCCCTGQCCG 
CGCAGCCAGG 
GGCCCGCGCC 
TACAAOGCXA 
TACAAGGCC6 
CTCACCGACG 
TTCCACCCAC 
GAAGCTGTCA 
AATGAQAarr 
GATCTCTTCC 
GACATCACAG 
CTCTOGGTQQ 
CGGCACGGGC 
CACTTC05CA 
CCCAAGAAGC 
AGGCAG6CCT 
TGGATCATOG 
CTGAACTCCT 
AACCOGGAAA 
CTCTACTTCG 
GCCIOTGGCT 
ATCCTCCATT 
CCCTCCCTAT 
TTTTGATCAG 
AAACCTAGCA 
GGGAAGTCTC 
6GCCA0CCAG 
OAAAGGAAAA 



31 

1 

ACCGACGQCC 
CCGGA6CCCG 
CGCCX3CACAG 
ACTTCAGCCT 
AGCGGCGGGA 
CGCACCTCCA 
TOGOGGTGGA 
TCTTCAGTAC 
CCGACATGGT 
GCTACCACCA 
0GGCAGCCX3A 
TCOGGATCAG 
TOCTCGACAG 
CCACCAGCAA 
AGA06CTG6A 
CCCAGAACAA 
6CATCCGGTC 
AGGAAGCCCT 
GTAAGAAOCA 
CGCCTGAAGQ 
ACATGAAOGC 
OQGTGOCCAA 
ATGACA6CTC 
QCCACTAGCI 
GCTC6CCTTG 
CCGCAACTTT 
TTTTTCAGTG 
GGAAAAAAAA 
AGCCATGCAC 
C08TGG6AGG 
TTGACCC3GQA 



Seq ID NO: 659 Protein sequence
Protein Accession #: NP_001710



HHVRSIiBAAA 
ILGLPERFRP 
LQDSHFLTDA 
IRERFDNETF 
BNLGLQLSVE 
QNR5XTPKNQ 
GECAFPLNSY 
RKMWRACGC 



11 

I 

PHSFVALWAP 



DMVMSFVNLV 
RISVYQVLQE 
TLDGQSINPK 
EAIiRMANVAB 
MNATNHAIVQ 
K 



21 

I 

IiFLXiRSALAD 
MFMLDLYNAM 
EHDKEFPHFR 
HLGRESDLFL 
LAGIiIGRHGP 
NSSSDQRQAC 
TLVHFIHPBT 



31 
I 

FSLDNEVHS5 
AVEEGG6PGG 
YHHREFRFDL 
LDSRTLHASE 
QNKQPFMVAF 
KXBELYVSFR 
VPKPCCAPTQ 



Seq ID NO: 660 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 211..1895



1 
I 

GGATCTGAGG 
GGGGCTTGGG 
GAGGAATTAT 
TQATTTTTTT 
GTGCTTTTTC 
CACAGQTTCC 
CTTGTQCTGA 
GAAGGTAATT 
AAAATATCGG 
TTC06ACACT 
GCCAATTATT 
TTCTTTGAAC 
GCTGTG6CTA 
CACATGCACT 
GTAOTCCATG 
CAAAATTCCA 
GTTGTGATGT 
TACCTGCATA 
ATCTTGATAG 
ACTCTGGCTG 
GCACXX5ATCT 
CTAGCTACCA 
AAACTGGCCA 



11 
f 

GGGGCCCAGT 
AGGCAGCCTG 
CTGATAAAAT 
CCCTG6AAAA 
TTTTCTCTTC 
TTGAACAGCT 
AAGCGAAAGT 
GTTTCCCTGA 
CTGTTCCATG 
GTAACCCCAA 
CAGACTGCCT 
GCCTCTATQT 
TTCTCATGAT 
TATTTQTGTC 
CTCACATAGG 
TTGAGGCAAC 
TTATTTACTT 
ATCTCATCTT 
GCTGGGGGTT 
ATGCX3AG6TG 
TAGCAGCTAT 
AAATCTGGGA 
AATCGACACT 



21 
I 

CACTTCCTCC 
CTCTCCAGTC 
TCCTGGGTTA 
TQACCTTTTT 
TTTTTCTA06 
GGATTCTGAT 
ACAATGTGAA 
ATG6GATGGA 
CCCTCCITAT 
TGGAACATGG 
TCGCTTTCTG 
AATGTATACC 
TGGTTACTTC 
TTTCATGCTG 
AOTAAAGOAG 
TTCTGTGGAC 
CCTGGCTACA 
TGTGGCTTTC 
TCCAGCAGCA 
CT6GGAACTT 
TGQGCTGAAT 
GACCAATGCA 
GGTGCTGGTC 



31 

I 

ACGTTCTCGT 
CCTATCCACC 
ATATTTTTAA 
ATGCrrCGAA 
ATAAATGAAA 
GGCACCATTA 
CTCAACATCA 
CTCATTTGTT 
ATTTATGACT 
GATTTTATGC 
CAGCCAQATA 
GTTGGCTACT 
AGACGATTGC 
AGAGCTACAA 
CTG6A6TCCC 
AAATCACAAT 
AATTATTATT 
TTTTCGGACA 
TTTQTTGCAG 
AGTGCTGGAG 
TTTATTCTGT 
GTTGGGCATG 
CTAGTCTTTG 



41 
I 

GGGAOGQCCG 
GGTAGCGC6T 
CTTCGTGGOG 
GGACAACGAG 
GATGCAGCGC 
GGGCAAGCAC 
GGAGGGOSGC 
CCAGOQCCCC 
CATQAGCTTC 
T06AGAGTTC 
ATTCCG6ATC 
OSTTTATCAG 
CQiTACCCTC 
CCACTGGGT6 
TGGQCA6AGC 
GCAGCCCTTC 
CAOGGGGAGC 
60GGATGGCC 
OGAGCTOTAT 
CTACX3C0GCC 
CACCAACCAC 
GCCCTGCTGT 
CAAOGTGATC 
CCTOOQAGAA 
6CCA6GAACC 
AAAGGTGTGA 
GCAGCATCCA 
ACAA OSCATA 
GGACTCXSTTT 
AAOGGOGOST 
AOTTCCTCITA 



41 

I 

FZHRRLRSQB 
QOPSYPYKAV 
SKIPEGBAVT 
EGHLVFDZTA 
FKATBVHFRS 
DLGHQDWZIA 
LNAZSVLYFD 



41 
I 

GCTGGGCGGG 
CACAGGTTTT 
AAAOGGAGAG 
GCAGTTTGTC 
GCMTTCTTC 
CTATAGAGQA 
CAGCTCAACT 
GGCCCAGAGG 
TCAAGCATAA 
ACAGCTTAAA 
TCAGCATAGG 
CCATCTCTTT 
ATTG CACTA G 
GCATCTTTQT 
TAATAATGCA 
ATATOGQGTG 
GGATCCTGGT 
CCAAATACCT 
CATGG6CT6T 
ACATCAAGTG 
TTCTGAATAC 
ACACAAGGAA 
QAGTGCATTA 



51 
1 

CCTGCCCCCT 
AGAGCCGGCX3 
CTCTGGGCAC 
GTGCACTOSA 
GAGATCCTCT 
AACTCX3GCAC 
GG6CCCGGC0 
CCTCTGGCCA 
OTCAACCTOQ 
GGGTTTGATC 
TACAAGGACT 
GT6CTCCAGG 
TGGGCCTCGG 
GTCAATOCGC 
ATCAACCCCA 
ATGGTGGCTT 
AAACA6CGCA 
AAC6TGGCAG 
6TCAGCTTCC 
TACTAGTGT6 
GCCATCGTGC 
GCGCCCACGC 
CIGAAGAAAT 
TTCAOACCCT 
AGCAGACCAA 
6AGTATTAGG 
ATGAACAAGA 
AAGAAAAATG 
CXAOAGGTAA- 
GGCAAGGGGT 
ATAAATOTCA 



51 

I 

RREMQREILS 
PSTQGPPCAS 
AAEFRIYKDY 
TSNHWWKPR 
IRSTGSKQR8 
PBGYAAYYCB 
OSSNVIIifCKy 



51 
I 

AGGAGCX3GAT 
TTGGGTOGGA 
TTTTTAAAAA 
AACCAGCSOA 
AAGAAAAAGG 
6CAGATTGTC 
CCAGGAGGGA 
AACAGTGGGG 
AGQAGTTGCT 
TAAAACATGO 
AAAGCAAGAA 
TGGTTCCTTG 
GAACTATATC 
CAAAGACAGA 
GQATGACCCA 
CAAGATTGCT 
GGAAGGTCTC 
GIGGGGCTTC 
GGCACGAGCA 
GATTTATCAA 
GGTTAGAGTT 
GCAATACAGG 
CATCGTGTTC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380




GTATGCCTGC CTCACTCCTT CACTGGOCTC GGGT6GGAGA TCCGCATGCA CTGTGAGCTC 1440

TTCTTCAACT CCTTTCAOGG TTTCTTTGTG TCTATCATCT ACTGCTACTQ CAATGGAGRG 1500 

OTTCAGGCAG AGQTQAAGAA GATGTGGAGT CGGTGGAATC TCTCCGTGGA CTGGAAAAGG 1560 

ACACCXSCCa-T GTGGCAGCCG CAGATGCGGC TC3W3TGCTCA CCACCGTGAC GCACAGCACC 1620 

AGCftGCCAGT CACAGGTQGC GGCCAGCACA CX3CATGGTGC TTATCTCTGG CAAAGCTGCC 1680 

AAGATCGCCA GCAGACAGCC TGACAGCCAC ATCACTTTAC CTGGCTATGT CT GGAG TAAC 1740 

TCAC3AGCAGG ACTGCCTGCC ACACTCrTTC CACGAGGAGA CCaAGGAAQA TAGTGGGAGG 1800

CAGGGAGATG ATATTCTAAT GQAGAAGCCP TCCAGGCCTA TGQAATCTAA COCAGACACT 1860
GAAGGATGCC AAGGAGAAAC TGAGGATGTT CTCTOA 



Seq ID NO: 661 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

MLRSSLSTSI VhPhPSSWST INBSISSRICR HRPLEQU>SD GTITIEEQIV LVLKAKVQCE 60 

UWTAQLQBG EQKCTPBMDO LICWPROTVG KISAVPCPPY lYDPNHKGVA PRHCMPNGTW 120 

DPMKSLHKTW AKYSDCLRPL QPDISIGKQE PFERLYVMYT VOYSISPGSI. AVAILIIGYF 180 

RRLHCTRmri HMHLFVSFMI. RATSIFVKDR WHAHIGVKB IiBSLIWQDDP QNSIBATSVD 240 

KSQYIGCKIA WMFIYFIAT NYYWILVEGL YLHNLIFVAF FSDTKYLWGF ILIGWGFPAA 300 

FVAAWAVARA TLADARCWBI. SAGDIKWIYQ APXLAAIGLN FILFLNTVRV LATKIWETNA 360 

VGHDTRKQYR KLAKSTLVLV LVPGVHYIVP VOiPHSPTOI. GWEIRMKCEL FENSPQGPFV 420 

SIIYCYCMGE VQABVKKKWS RWNLSVDMKR TPPC6SRR0G SVLTTVTHST SSQSQVAAST 480 

RMVLZSGKAA KIASRQPDSH ITLPGYVWSM SBQDCLPHSP HEBTKEDS6R QGDDILMBKP 540 
SRFNESNPDT B6CQGETEDV Ii 



Seq ID NO: 662 DNA sequence
Nucleic Acid Accession #: NM_005048
Coding sequence: 143..1795

1 11 21 31 41 51 

GGCCGGTGGC CCGOQCCCXJA CCACCCCAGC TGCGCGTCGT TACTGQCCAC AAQTTTGCTC 60 

TGGGCCAGCC AAGTTCGCAA CTTGGAAGCT TCTCCXX3GGC TCTG6AGGAG GGTCCCTGCT 120 

TCTTCCTACA GCCGTTCCGG GCATGGCCGG GCTGQG0GC6 T0GCTCCAC6 TCT6GGGTT0 180 

GCTAATGCTC GGCAGCTGCC TCCTGGCCAG AGCCCAGCTO GATTCTGATG GCACCATTAC 240 

TATAGAGGAG CAGATTGTCC TTGTGCTGAA ^CGAAAGTA CAATGTGAAC TC3UVCATCAC 300 

AGCTCRACTC CAGGAQGGAG AAGGTAATTG TTTCCCTGAA TGGGATGGAC TCATTTGTTG 360 

GCCCAGAGGA ACAQTGGGGA AAATATOGQC TQTTCCATGC CCTCCTTATA T TTATG ACTT 420 

CAACCATAAA GQAGTTGCTT TCC6ACACT6 TAACCCCAAT GOAACATGGO ATTTTATGCA 480 

CAGCTTAAAT AAAACATGGG CCAATTATTC AGACTGCCTT CXSCTTTCTGC AGCXaGATAT 540 

CAGCATAGGA AAGCAAGAAT TCTTTGAACG CCTCTATGTA ATGTATACCG TTGGCTACTC 600 

CATCTCTTTT GGTTCCTTGG CTGTGGCTAT TCTCATCATT GGTTACTTCA GAOGATTQCA 660 

TTCCACTAGG AACTATATCC ACATGCACTT ATTTGTGTCT TTCATCCTQA GAGCTACAAG 720 

CATCTTTGTC AAAGACAGAG TAGTOCATGC TCACATAGGA GTAAAGQA6C TOGRGTCCCT 780 

AATAATCCAG GATGACCCAC AAAATTCCAT TGAGOCaACT TCT6TGC5ACA AATCACftATA 840 

TATCGGGTGC AAGATTGCTG TTGTGATGTT TATTTACTTC CTGG CTAC AA ATTATTATTG 900 

6ATCCTGGTG GAAG6TCTCT ACCTGCATAA TCTCATCTTT GTGGCTTTCT TTTCX3GACAC 960 

CAAATACCTG TGGGGCTTCA TCTTGATAGG CrGGGGOTTT CCAGCAGCAT TTGTTGCAGC 1020 

ATGGGCIGTG GCACGAGCAA CTCTOGCTQA TG08AGQTGG TGGGAACTTA GTGCTGGAGA 1080 

CATCAAGTGG ATTTATCAAG CaCOGATCTT AGCAGCTATT GGGCTGAATT TTATTCTGTT 1140 

TCTGAATACG QTTAGAGTTC TAGCTACCAA AATCTGGGAG ACCAATGCAG TTGG GCAT QA 1200 

CACAAGQAAG CAATACAGGA AACTGGCCAA ATCGACACTG 6TCCTGGTCC TAGTCTTTGG 1260 

AGTOCATTAC ATCGTGTTCG TATGCCTGCC TCACTCCTTC ACTGG6CT0G GGTGGGAQAT 1320 

CCOCATGCAC TOXGAGCTCT TCTTCAACTC CTTTCAGGGT TTCTTTGT6T CTATCATCTA 1380 

CTGCTACTGC AATOGACSAGG TTCAGGCAGA OGTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CTCCGTOGAC TGGAAAAGGA CACCGCCATG TGGCA6CCGC AGATGCGGCT CAGTGCTCAC 1500 

CACCGTQACG CACAGCACCA GCAGCCAGTC ACAGGTGGCG GCCA6CACAC GCATGGTGCT 1560 

TATCTCTGGC AAAGCTGCCA AGATCGCCRG CAGACAGCCT GACA GCCA CA TCACTTTACC 1620 

TGGCTATGTC TCGAGTAACT CAGAGCAGGA CTGCCTGCCA CACTCTPTCC ACGAGGAGAC 1680 

CAAGGAA6AT AGTGGGftGGC A0G6AGATGA TATTCTAATG GAGAAGCCTT CCAGGCCTAT 1740 

GGAATCTAAC CCAGACACTG AAGGATGCCA AG6AGAAACT GAGGATGTTC TCTGAATGGA 1800 

CATTTGTGGC TGACTTTCAT QGGCTGGTCC AATGGCTGGT TGTGTOAGRG GGCTTGGCT6 1860 

ATACTCCTAT GCTTGAGTTC AAAGGCTQAA AATTCAGTTA AGGTGTTACT TAATAATAGT 1920 

TTTTAGGCTC CATGAATTGG CTCCTGTAAA TACTAACGAC ATGAAAATGC AAGTGTCAAT 1980 

QGAOTACTTT ATTACCTTCT ATTGGCATCA AGTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCTCTOTGAT TOTTCArTTT TTTCTGCTAC TTTTG6GTAG AAAAA AGATT CAATTGCT TG 2100 

GCTGTAGCTT TCTCTCATAT ATATCACCCT AAATATAATG AAQATCTTTT AGTOTGTATC 2160 

ATTTTCCTTT TAGAAACTAG TATTCTCTTA TTTCTTACTP TAATGTACTT CTATCACTQC 2220 

ATTTATrtTG CCTGTGCATA GGAGCAATTA GGATCTAAAA AAATATATGG GAAQATAAAA 2280 

GATCTAA6AA CAAGTACTTG CTGGAAAATT AGTTGGCTGG ACATTGATAA AATAATGCAT 2340 

TTATAACAAT TACATGTGTT TTTGGQAACA AGGAAAATTT CTCAAAAAAG AAT ATTTCA C 2400 

ACATCCCTTC TTTTGAATGG CCi'CTr i OTG ACCAGCCAGA COTCAGGTCT TCACTCTTTC 2460 

TTCTTTGTAA ACCATGTCAT GTGOAAAGAT TTCCTCRGTT AGTGAGCTTG TGTCTGCAAA 2520 

TTGATTTTOT TTQTAATGTA TTTTGATAOC AAATCATGCT GCATCTATAT CTfTTT CTTG 2580 

TTTGftGCTOT TACtACATTG TACATQGCAT GTGGGATCAA TTAAAAATTT OTTTTAAAAA 2640 
T 



Seq ID NO: 663 Protein sequence
Protein Accession #: NP_005039

1 11 21 31 41 51 

ilAGLGftSLHV WGWLMLGSCL iARAQUiSDO TITIEBQIVL VLKAKVQCEL NITAQLQEGH 60 

©JCFPEWDGL ICWPRGTVGK ISAVPCPPYI YDPNHK6VAP HHCMPNGTWD FMHSLMICTWA 120 

NYSDCI4RPLQ PDISIGKQEP FERLYVMYTV GYSISPGSLA VAILIIGYFR RLHCTRMYIH 180 

MHLPV8PMLR ATSIPVKDRV VHAHIGVKEL ESLIMQDDPQ NSIEATSVDK SQYIGCKIAV 240 




VMFIYFLATH VYHILVEGLY LHNLIFVAFP SDnCYLHOFI LIQWOPPAAF VAANAVARAT 300 

hMMCHELS AGDIKWiyQA PJLAAIGLNP ZliFLNTVRVL ATKZWB1«AV GBDnUCOYRK 360 

lAKSTLVLVL VFGVBYIVFV CLPHSPTGIiO HBZRMHCBLF FKSFQGFFVS ZIYCYaiGEV 420 

QAEVKKMHSR WNLSVDHXRT PPCSSHRCQS VLTTVTBSTS SQSQVAASTR KVLXS6KAAK 480 

lASRQFDSHI TZiPOyVWSNS EQDCLPRSFB BETKED8(StQ (SIOILMBKPS RFMESNPDTE 540 
GOQGETEDVIi 

Seq ID NO: 664 DNA sequence
Nucleic Acid Accession #: NM_012152
Coding sequence: 43..1104

1 11 21 31 41 51 

I I I i I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCXa CAATGAATGA GTGTCACTAT 60 

GACAA6CACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTOTOSA TQACTGOACA 120 

6QAACAAAGC TTGTGATTGT TTT6TGT6TT QGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCrC TGGTCATCGC QGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTOGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA AC06CTGGTT TCTCCGTCAG 360 

GGGCTTCTQG ACAGTAQCTT GACTQCTTCC CTCACCAACT TGCTGGTTAT OSCCXTTOGAG 420 

AGGCACATGT CAATCATOAO GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGQTOACA 480

CTGCTCATTT TGCTTOTCTQ GGCCATCGCC ATTTTTATGG GGGOGGTCCC CACACTOGOC 540 

TGQAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGC RGQA GT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTOTAC 660 

CTG06GATCT AOGTGTACGT CAAGAGGAAA ACCAAOGTCT TGTCTCG8CA TACAAGTGGG 720 

TCCATCA6CC GCOSGAGGAC ACCCAIX3AAG CTAATSAAGA OSOTOATOAC TGTCTTAGGO 780 

GCX3TTTGTGG TATGCTGGAC CCCGGGCCTG 6TGGTTCTGC TCCTOGACGG CCTGAACTGC 840 

AGGCAGTGTG GCX3TGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAO GACGAQGACA T6TATGGGAC CATGAA6AAG 960 

ATGATCT6CT GCTTCTCTCA GGAGAACCCA GAOAGGGQTC CCTCTGGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGGCA6 TACATAGAGG ATA6TATTA0 CCAAGGTGCA 1080 

GTCTGCftATA AAAOCACTTC CTAAACTCTO GATGCCTCTC GGCCCaCCCA GGTGATGACT 1140 
GTCTTAGO 

Seq ID NO: 665 Protein sequence
Protein Accession #: NP_036284

1 11 21 31 41 51

| | | | | |

raiECHYDKHM DFFYMRSNTD TVODNTGnOi VZVI^VGTFP CLFIFFSHSL VIAAVIKNRK 60 

FHFPFYYLLA NIAAADFFAG lAYVFLMFNT GFVSXTLTVN RWFIiRQGIiLD SSIiTASLTNL 120 

LVIAVERHMS IMRMRVHSNL TKKRVTLLZL LVWAIAIF»4G AVPTLGHUCL QTISACSSIiA 180 

PIYSRSYLVP WTVSNLMAPL IMWVYliRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 

VMTVIiGAFW CHTPGLWLL IiDGUfCRQOG VQHVKRWFLL LALLNSWNP ZIYSYRDEDM 300 
Y(?IMKKHZOC FSQENPERRP SRIP8TVLSR SDTGSQYIBD 8Z8QGAVC3IK 8T8 

Seq ID NO: 666 DNA sequence
Nucleic Acid Accession #: NM_002821
Coding sequence: 150..3362

1 11 21 31 41 51

| | | | | |

AACTCCCGCC TCGGGACGCC TOGGGGTOGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GOGTCOGCCT CCTGTGCCOG COGCGGAGCA GTCTGCGGCC CGCCGTGOGC 120 

CCTCAGCTGC TTTTQCraAO GCOGCOGOQA TGGGAGCTGC GG6GGGATCC CCGGCCAGAC 180 

CCC6CCQGTT GCCTCTQCTC AOCSGTCCTQC TGCTGCOGCT 6CTGGG0GGT ACCCAGACAO 240 

CCATTGTCTT CATCAAGCAO CCGTCCTCCC AGGATGCACT GCAGGGGOGC CGGGOGCTGC 300 

TTCGCTGTGA GGTTQAGGCT COGGGCCOQG TACATGTGTA CTGGCTGCTC GATGGGQCCC 360 

CTGTCCAGGA CACGGAGCOO CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTQTGG 420 

ACOQGCTGCA G6ACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGOAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TQAGGCA6GT CCTGT6GTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTOGG CCCACCTACC AATGGTTCCG AQATGGGACC CCCCTTTCTC 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGQAGCGGAA CCTGACGCTC OGGCCAGCTG 720 

GTCCTOAGCA TAGTGGGCTQ TATTCCTGCT QCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCA6CCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAOG AGGCCATGTT CCATTGCCA6 TTCTCAGCOC 900 

AGCCACCCCC GAGCCTQCAG TGGCTCTTTG AG6ATGAGAC TCCCATCACT AACC8CABTC 960 

GCCCCCCACA CCTCCGCAOA GCCACAGTQT TTGCCAAOGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACQ CAAT6CAG00 ATCTACCOCT GCATTGGCCA 6GGGCAGA6G GGCCCACCCA 1080 

TCATCCTG6A AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTO TGACCTGCCT TCCCCCCAAG GGTCTQCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CAOGCGGGAG TCCG6CTGCC CAOCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA . OQAQCTQGTO TTGGCCAArA TT6CTGAAAG TGATGCTOGT GTCTACACCT 1320 

GCCAOGCQGC CAACCTGGCT GGTCAGCGGA GACAQGATGT CAACATCACT GTQGCCACTG 1380 

TGCCCTCCTG GCTGAAQAAG CCCCAAGACA GCCAGCTGGA GQAGGGCAAA CCOGGCTACT 1440 

TGGAT7GCCT GACCCA6GCC ACACCAAAAC CTACAGTTGT CZGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCAGGG TTCGAGGTCT TCAAGAATGG GACCTTGCOC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACOGTT OTATQAGCAG CACCCCAGCC GGCAGCATCG 1620 

A6GCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA OGGGCAGATG GGAGCAGCCT CCCAGAGTGG 6TGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT OCCOGGGIGA CTCGAGATGA COCTGGCAAC TACACTTGGA 1860 

TTGCCTCCAA CGGGCCGCA6 GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGT6GAA CCAGAGGGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA OTGQAAAGGC AAGGACOGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TQCACATCTT CCAGAATGGC TCCCTGGTGA 2100 




TCCATGACGT GGCCCCTGAQ QACTCAGGCC GCTACACCTQ CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACX3GAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCOG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTQGGTG 2280

COSCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGGG QCTGCAGAAG CAGCGCGAGG G06AGGAGCC AGA6ATGGAA TGCCTCAACG 2400

GAG66CCTTT GCAGAACGGG CAGCCCTCAG CftGAGATCCA AGAAQAAGTG 6CCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACX3CCACAG CACAAGTGAT AAGATGCACT 2520

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGQGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGG6C TTGGAGGAGG GAGTGGCA6A aACCCTGGTA CTTGT GAAGA 2640 

GCOKSCAGAC GAAGQATOAO CAGCAQCAGC TGGACTTC08 QAGGQAGTTG GAGATGTTTQ 2700

QGAAGCTGAA CCAOGCCAAC QTGGTGCGGC TCCTGGGQCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGOAATAT GTGGATCTGG GAGACCTCAA OCRGTTCCTG AG6ATTTCCA 2820 

A6AGCAAGQA TGAAAAATTG AAGTCACAGC CCCTGAGCAC CRAGCAGAAG GTGGCOCTAT 2880

6CACGCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA COGCTTTGTG CATAAGGACT 2940 

TGGCTGGGOG TAACTGCCTG GTCAGTGCCC A6AGACAAGT GAAGGTGTCT GCCCTGGGOC 3000

TCAQCAA6GA TGTGTACAAC AGtOAGTACT ACCACTTC06 CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCOGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCGCGAGG 3240

GCTGCCCTTC CAAACTCTAT CGQCTGATGC AGCGCTGCTO GGCCCTCAGC CCCAAGGACC 3300

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TQGGAGACAG CACCGTGGAC AGCAAGCCOT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGQAGGA CATCTCTAGA GGGAAQCTCA 3420 

CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480

TTGCTGAGGT CTGAGCAGGG CCTQGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

QGCSGMCTTQ GACCCAAACT GGGOSACTaS GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600

CTCTTCCTCT ATCAGGGACA GTQTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACGGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GQGGAGGGCT 3720

AG6CTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGOGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAO TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

C TTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGC3GGCTTT TATATGTAAT 4020 

TOCAGCGTGQ GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCT GGAQA TQ AOGAOGOTOS 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTOTT TTTTTOTTTT 4140 

35 Tb " r ' m"iG ' n ' tttacactcg ctgctctcra taaataagcc tttttta 






Seq ID NO: 667 Protein sequence
Protein Accession #: NP_002812






MGAARGSPAR 
VHVYWIiLDGA 
IKWIEAGPW 
KERNLTLRPA 
EANFHCQPSA 
CIGQGQRGPP 
VRIiPTHGRVY 
8QI£EGKPGY 
CKSSTPAGSI 
GSSLPEWVTD 
TTVYQQHTAL 
RYTCXAGHSC 
GLHFYCKRRC 
KRH8TSDKMH 
LDFRRELEMF 
PLSTKQKVAL 
YHFRQAHVPL 
A6KARLPQPE 



11 

I 

PRRLPLLSVL 
PVQDTERRPA 
LKHPASEAEI 
GPEHSGLYSC 
QPPPSLQWLF 
IZLEATLHLA 
QK6HELVLAH 
LDOiTQATPK 
EAQARVQVLE 
NAGTLHFARV 
LQ CEAQG DPK 
NIKBIEAPLY 
KAKRIiQXQPE 
FPRS5LQPZT 
GKIiNHA2IWR 
CTQVALGMEH 
RHHSPBAILB 
GCPSKLYRLM 



21 

I 

LliFIiLGGTQT 
QGSSLSFAAV 
OPQTQVTLRC 
CAHSAPGQAC 
EDETPITWS 
EIEDMPItPEP 
lABSDAGWT 
PTWWYRNQM 
KLKPTPPPQP 

PLIQWKGKDR 
VVDKPVPGB8 



31 
I 

AIVFIKQPSS 
DRLQD8QTFQ 
HIOGHPRPTV 
SSQNFTLSIA 
RPPHLRRATV 
RVFTAGSEER 
CBAANIjAOQR 
LI8EDSRFEV 
QQCMEFDKEA 
lASNQPQGQI 
ILDPTXLGPR 



TLGKSEFGEV 
IiLGLCREAGP 
LSNNRFVHKD 
GDFSTKSDVW 
QRCHALSPKD 



GGPLQNGQP6 
FLAKAQGLEE 
HYMVLEYVDL 
LAASNCLV8A 
AFGVUWBVF 
RP8FSEIASA 



41 
I 

QDALQGHRAL 
CVARDDVTGE 
QWFRDGTPLS 
DBSFARWLA 
PANGSLUiTQ 
VTCIiPPKGZiP 
RQDVNZTVAT 
FKNQTLRZNS 
TVPCSATGRE 
RAHVQLTVAV 
HHIFQNGSLV 
MZQTZGtiSVG 
AEIQEEVALT 
GVABTLVLVK 
GDLKQFLRIS 
QRQVKVSALG 
THGBMPHGGQ 
LGDSTVDSKP 



Seq ID NO: 668 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1389



AT666CTACC 
ACGCTTOTTT 
GTTGTCAACT 
GGGTTTCCTT 
GTTTTATTGA 
AAAACTTTCG 
ATAGCAATGA 
ATCCCAGGAG 
ACAGTTACCT 
TCCCTCATCT 
TCACTGQGTC 
ATTCAAGCGG 
TACAGTTCTC 
GTGATTTCTG 
TTCACCCAAG 
AGATTTTGTT 
GAGGTAATTG 
ACAGTGATGG 
GTTCTAGAAC 
TGTTATCTGA 
ATQCTTCCCA 



11 
I 

AGAGGCAGGA 
CTGAACATGA 
OSATTATAGG 
TGGGAATATT 
TAAAAGGAGG 
GCTTTCCAGG 
TAAGTTACAA 
TTGATCCTGA 
TTACTCTGCC 
CTACAGGTTT 
CACACATACC 
TCGQGGTTAT 
TAGAAGAACC 
TATTTATCTG 
GGGACTTAT7 
ATGGTGTCAC 
CCAATGTGTT 
TCATCACTGT 
TCAATGGTGT 
AACTGTCTOA 
TTGGTGCTGT 



21 

I 

GCCTGTCATC 
GTATAAMjAG 
ATCTGGTATA 
GCTTTTATTC 
GGCCCTCTCT 
GTATCTGCTC 
TATAATAGCT 
AAAOOTGTTT 
TTTATCCTTG 
AACAACTCTG 
AAAAACAGAA 
GTCTTTTGCA 
CACAGTAGCT 
TATATTCTTT 
TCAA AATTft C 
T6TCATTTTQ 
TTTTGGTGGG 
AGCCACGCTT 
GCTCTGTGCA 
AOAACCAAGG 
GGTGATQGTT 



31 
1 

CC6CCGCAGA 
AAAACCTGTC 
ATAGGATTGC 
TGGGTTTCAT 
GGAACAGATA 
CTCTCTGTTC 
GGAGATACTT 
ATTGGTCGCC 
TACGGAAATA 
ATTCTTG6AA 
GACGCTTG6G 
TTTATTTGCC 
AAGTGGTCCC 
GCTACATGTG 
TGCAGAAA7Q 
ACATACCCTA 
AATCTTTCAT 
GTGTCATTGC 
ACTCCCCTCA 
ACACACTCCG 
TTTG6ATT0G 



41 

I 

GA6ATTTAGA 
AGTCTGCTGC 
CTTATTCAAT 
ATGTTAOGGA 
CCTAC CAGTC 
TTCAGTTTTT 
TGAGCAAAGT 
ACTTCATTAT 
TAGCAAAGCT 
TT6TAATGGC 
TATTTGCAAA 
ACCATAACTC 
GCCTTATCCA 
GATACTTGAC 
ATQAOCTGGT 
TG6AATGCTT 
CG6TTTTCCA 
TQATTGATTG 
TTTTTATCAT 
ATAAGATTAT 
TCATGGCTAT 



51 
1 

ItRCBVGAPGP 
BARSANASFN 
DGQSNHTV3S 
PQDVWARYE 
VRPRNAGIVR 
BPSVWWEHAG 
VPSNIiKKPQD 
VEVYDGTIfXR 
XPTIKHBRAD 
PITFKVBPESl 
IHDVAPEDSG 
AAVAYIIAVL 
SLGSGPAA7N 
SLQTKDEQQQ 
K8KDEKLKSQ 
LSKDVYNSBY 
ADDEVIiADLQ 



51 
I 

TCACAGAQAA 
TCTTTTTAAT 
GAAGCAAGCT 
CTTTTCCCTT 
TTTQGTCAAT 
GTATCCmr 
TTTTCAAAGA 
TGGACTTTCC 
TGGAAAGGTC 
AAGGGCftATT 
GCCCAAT6CC 
CTTCTTAGTT 
TATGTCCATC 
ATTTACTGGC 
AACATTTGOA 
TGTGACAA6A 
CATTGTTGTA 
CCTCGGGATA 
TCCATCAGCC 
GTCTTGTGTC 
TACAAATACT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 




OtAOACIGCA OCCATSGGOV GGAAA1QTTC TACTOCTTTC CTGACAATTT CTCTCTCACA 
AATACCTCflG AGTCTCATGT TCAQCAOACA ACKCMCTTT CTACTTTAAA TATTAGIATC 
TTTCAATQA 



1320 
1380 






Seq ID NO: 669 Protein sequence
Protein Accession #: Eos sequence 



1 
1 

MSyQRQEPVI 
GFFLGILZiIiF 
XAMISYHIIA 
SlilSTGLTTL 
YSSLEBPTVA 
RFCYCVTVIL 
VLEIiNGVXiCA 
QDCTHGQEMF 



11 
1 

PFQSDLDDRS 
WVSYVTDFSL 
GDTLSKVPQR 
ILGIVMARAI 
KW8RLIHMSI 
TYPMECPVTR 
TFLIPIIPSA 
YCPPDMPSLT 



21 
I 

VIiLXXOGAIiS 
IPGVDPENVP 
SIiOPHIPKTE 
VISVPICIFP 
EVIANVPPG6 
CYLKLSEEPR 
NTSBSHVQQT 



31 
i 

KTOQS AALfW 
GTDTYQSLVN 

IGRHFIIGLS 
DAWVFAKPHA 
ATOGYLTPTG 
NLSSVFRIW 
TBSDKIMSCV 
TQLSTLHISI 



41 

I 

VVNSZZ6S6Z 
KTFQPPGYUj 
TVTPTIiPLSL 
IQAVGVMSFA 
FTQGDIiFENY 
TVMVITVATL 
MLPIGAWKV 



51 
1 

ZGZJ>YSMKQA 
LSVLQFLYPP 
YRIflAXIiOKV 
FZaiHNSFLV 
CRKDDLVTFG 
VSLLIDCLGI 
FQFVHAITNT 



Seq ID NO: 670 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1284



1 
I 

ATQCX3CTACC 
AAGCAAGCTG 
TTTTGCCTTG 
TTGGTCAATA 
TATCCTTTTA 
TTTCAAAGAA 
GGACTTTCCA 
GGAAAOGTCT 
A6GGCAATTT 
CCCAATGCCA 
TTCTTAQTTT 
ATQTCCATCG 
TTTACTQGCT 
ACATTTGQAA 
GTGACAAGAG 
ATTGTTGTAA 
CTC36GGATAG 
CCATCAGCCT 
TCTTGTGTCA 
ACAAATACTC 
TCTCTCACAA 
ATTA6TATCT 



11 
I 

A GRGG CAGGA 
GGTTTCCTTT 
TTTTATTGAT 
AAACTTT036 
TAGCAATGAT 
TCCCAGOAOT 
CAGTTACCTT 
COCTCATCTC 
CACTQSGTCC 
TTCAAGOGGT 
ACAGTTCTCT 
TGATTTCTGT 
TCACCCAAGG 
GATTTTGTTA 
A6GTAATT6C 
CAGTGATGGT 
TTCTAGAACT 
GTTATCTGAA 
T6CTTCCCAT 
AAGACTGCAC 
ATACCTCAGA 
TTCAACTGGA 



21 
I 

GCCTGTCATC 
G6GAATATTG 
AAAAGGAGGG 
CTTTCCAGGG 
AAGTTACAAT 
TOATCCTQAA 
TACTCTGCCT 
TACAGGTTTA 
ACACATACCA 
CGGGGTTATG 
AGAAGAACCC 
ATTTATCTGT 
GGACTTATTT 
TG6TGTGACT 
CAATGTGTTT 
CATCACTGTA 
CAATGGTGtG 
ACT6TCT6AA 
TSGTSCTGTS 
CCATGQGCAO 
GTCTCATQTT 
GTAA 



31 

1 

CXX3CC6CAGA 
CTTTTATTCT 
GCCCTCTCTG 
TATCTCCTCC 
ATAATAGCT6 
AACGTGTTTA 
TTATCCTTGT 
ACAACTCTGA 
AAAACAGAAO 
TCTTTTGCAT 
ACA6TAGCTA 
ATATTCTTTQ 
QAAA ATTAC T 
GTCATTTTGA 
TTTGOTOGGA 
GCCACGCTTG 
CTCTGTGCAA 
GAACCAAGQA 
GTOATGOTTT 
QAAATGTTCT 
CAGCAOACAA 



41 

I 

GAGGATTQCC 
GGGTTTCATA 
GAACAGATAC 
TCTCTOTTCT 
QAGATACTTT 
TTGGTCGCCA 
ACCGAAATAT 
TTCnGGAAT 
ACGCTTGGOT 
TTATTTGCCA 
AGTGGTCCCG 
CTACATGTGG 
GCA6AAATGA 
CSHACOCmT 
ATCTTTCATC 
T6TCATTGCT 
CTCCCCTCAT 
CACACTCOQA 
TT6GATT0GT 
ACTGCTTTCC 
CACAACTTTC 



51 
1 

TTATTCAATG 
TGTTACAGAC 
CTACCAGTCT 
TCASTTTTTG 
GA6CAAAGTT 
CTTCATTATT 
AGCAAAGCTT 
TGTAAT6GCA 
ATTTQCAAAO 
CCATAACTCC 
CCTTATCCAT 
ATACTTQACA 
TGACCTQGTA 
6QAATGCTTT 
GGTTTTCCAC 
GATTGATTGC 
TTTTATCATT 
TAA6ATTATG 
CAXV36CTATT 
TQACAATTTC 
TACTTTAAAT 



Seq ID NO: 671 Protein sequence
Protein Accession #: Eos sequence



1 

I 

MGYQEQEFVI 
LVNKTFf^FG 
GLSTVTPTLP 
PHAI0AV6VM 
FTGFTQGDLF 
IWTVMVITV 
SCVNLPZGAV 
ZSIFQLE 



11 
I 

PPQRGLPYSM 
YLLLSVLQFL 
LStiYRNIAKL 
SFAFICHHNS 
ENYCRMDDLV 
ATIiVSLLIDC 
VNVFGPVMAI 



21 

1 

KQASFPLGIL 
YPFIAMISYN 
GKVSLI8TGL 
FLVYSSLEEP 
TFGRFCYGVT 
LG1VLEUR3V 
TNTQDCTHGQ 



31 
I 

LLFWVSYVTO 
IIAGDTLSKV 
TTLILGIVMA 
TVAKWSRLIH 
VILTYPMECF 
IiCATPLIPZZ 
EMFYCFPDNF 



Seq ID NO: 672 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1203



1 
I 

ATQQGCTACC 
AAA6GAGGG0 
TTTCCAGGGT 
AGTTACAATA 
GATCCTQAAA 
ACTCTGCCTT 
ACAGGTTTAA 
CACATACCAA 
GGGGTTATGT 
GAAGAACCCA 
TTTATCTGTA 
GACTTATTTQ 
QGTQTCACTG 
AATGTQTTTT 
ATCACT3TAG 
AATGGTGTGC 
CTGTCTGAAG 
GGTGCTGTGG 
CATGOGCAGQ 
TCTCAT6TTC 



11 
I 

A0A06CAG6A 
COCTCTCTCG 
ATCTGCTCCT 
TAATAGCTGG 
AC6T6TTTAT 
TATCCTTGTA 
CAACTCTGAT 
AAACAGAAGA 
CTTTTGCATT 
CAGTAGCTAA 
TATTCTTTOC 
AAAATTACTG 
TCATTTTGAC 
TTGGTGGGAA 
CCAOQCTTGT 
TCTGTGCAAC 
AACCAAGGAC 
TGATGGTTTT 
AAATSTTCTA 
AGGAGACAAC 



21 
I 

GCCTGTCATC 
AACAGATACC 

CTCTGTTCTT 
AGATACTTTG 
TGGTCGCCAC 
CG6AAATATA 
TCTTGGAATT 
OGCTTGGGTA 
TATTTGCCAC 
OTGGTCCCGC 
TACATGTGGA 
CAQAAATGAT 
ATACCCTATG 
TCTTTCATCG 
6TCA7TGCTG 
TCCCCTCATT 
ACACTCOGAT 
TGGATTCGTC 
CTGCTTTCCT 
ACAACTTTCr 



31 
I 

C06CGGCAGT 
TACCAGTCTT 
CAGTTTTTGT 
AGCAAAGTTT 
TTCATTATTG 
GCAAAGCTTG 
GTAATGGCAA 
TTTQCAAAGC 
CATAACTCCT 
CTTATCCATA 
TACTTGACAT 
GACCTGGTAA 
GAATGCTTTG 

GrrrrccACA 

ATTGATTGOC 
TTTATCATTC 
AAQATTATCT 
ATGGCTATTA 
GACAATTTCT 
ACTTTAAATA 



41 

I 

FSIiVIiLIKGG 
FQRIPGVDPE 
RAISLGPHIP 
MSIVISVPIC 

vtrevianvp 
psacylklse 
slustsesrv 



41 

I 

TTTCCCTTGT 
TG6TCAATAA 
ATCCTTTTAT 
TTCAAAQAAT 
GACTTTCCAC 
GAAAGGTCTC 
GGGCAATTTC 
CCAAT6CCAT 
TCTTAOTTTA 
TGTCCATOGT 
TTACTGGCTT 
CATTTGGAAG 
TGACAAGAGA 
TTGTTGTAAC 
TCGGGATAGT 
CATCAGCCTG 
CTTGTQTCAT 
CAAATACTCA 
CTCTCACAAA 
TTAOTATCTT 



60 
120 
180 
240 
300 
360 
420 



60 
120 

180

240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



51 
I 

AIiSGTDTYQS 
NVPIGRHPTI 
ICTEDAWVFAR 
IPFATOGYLT 
FGGNLSSVFK 
fiPRTHSDKZM 
QQTTQLSrm 



51 

I 

TTTATTQATA 
AACTTTOGGC 
AGCAATGATA 
CCCAGGAOTT 
AGTTACCTTT 
CCTCATCTCT 
ACPGGGTCCA 
TCAAGC6GTC 
CAGTTCTCTA 
GATTTCTGTA 
CACCCAAGOG 
ATTTTOTTAT 
G6TAATTGCC 
AGTGATGGTC 
TCTAGAACTC 
TTATCTGAAA 
GCTTCCCATT 
AGACTGCACC 
TACCTCA6AG 
TCAACTCGAG 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
Seq ID NO: 673 Protein sequence
Protein Accession #: Eos sequence

1          11         21         31         41         51
|          |          |          |          |          |
MGYQRQEPVI PPQPSLVLLI KGGALSGTDT YQSLVHKTFG FPGYLLIiSVIi QPLYPFIAMI 60
SYNIIAGDTL SKVPQRIPGV OPENVFXGRE FIIGLSTVTF TLPLSLYRMI AKLGKVSLIS 120
TGLTTLILGI VMARAISLGP HIPRTEDAWV FAKPNAIQAV GVMSFAFIOi HNSPLVYSSL 180
EBPTVAKWSR XiIHMSXVISV PICIPFATOG YLTPTGPTQG DLFENYC3%KD DLVTFGRFCY 240
OVTVILTYPM BCFVTREVXA NVFFGGNLSS VFHIWTVMV ITVATLVSU. IDGLOIVIiBL 300
N6VLC7VTPLI FIIPSACYIiK IiSEEPRTKSD XINSCVNLPI GAWMVFGFV MAITNTQDCT 360
HGQEMPYCFP DNFSLTNTSE SHVQQTTQLS TLMTISIFQIiB



Seq ID NO: 674 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1140

1          11         21         31         41         51
|          |          |          |          |          |
ATG8GCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCOGCTTT 60
CCAGGQTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120
TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180
CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240
CTGCCTTTAT CCTTGTACCX3 AAATATAGCA AAGCTTGGAA AGGTCTCCCT C3VTCTCTACA 300
GGTTTAACAA CTCTGATTCT TGGAATTGTA ATG6CAAGGG CAATTTCACT GGGTCCACAC 360
ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGOGGTCGGG 420
GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480
GAACCCACAG TAGCTAAGTG GTCCCQCCTT ATCCATATST CCATCX3T6AT TTCTGTATTT 540
ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTGAC CCAA06GGAC 600
TTATTTGAAA ATTACTGCAG AAATQATQAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660
GTCACTGTCA TTTTGACATA CCCTATGGAA T6CTTTGTGA CAAGAGAGGT AATTGCCAAT 720
GTGTTTTTTO GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780
ACTOTAOCCA GGCTTGTGTC ATTGCTGATT GATTGCCTCO GGATAGTTCT AGAACTCAAT 840
GGT6TGCTCT GTQCAACrCC CCTCATTTTT ATCATTCCAT CAGCCTOTTA TCTGAAACTQ 900
TOTGAAGAAC CAAGGACACA CTCXX3ATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960
GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAQA CTGCACCCAT 1020
GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACT06A6TAA



Seq ID NO: 675 Protein sequence
Protein Accession #: Eos sequence



1 
I 

MGYQRQEPVI 
PENVFIGRHF 
IPKTEDAHVF 
ICIFFATCGY 
VFFGGNLSSV 



HVQQTTQLST 



11 
I 

PPQVNKTPGF 
IIQLSTVTFT 
AKPNAIQAVG 
LTFT6FTQGD 
PHIWTVMVI 
IHSCVMLPI6 
LNISIFQIiB 



21 
I 

PGYLI.LSVLQ 
LPLSLYRNIA 
VMSFAFICBE 
LFBNYCSNDD 
TVATLVSLZiI 
AWNVPGFVM 



31 

1 

FLYPFIAMIS 
KLGKVSLXST 
NSFLVYSSLE 
IiVTFGRFCYG 
DCLGlVhELH 
AITNTQDCTH 



41 
I 

YNIIAGDTLS 
GLTTLIIiGIV 
EPTVAKH8RL 
VTVILTYPME 
GVIiCATPLIP 
GQEMFYCFPD 



51 

1 

KVPQRIPGVD 
MARAISLOFH 
IHMSXVISVF 
CFVTRBVIAN 
IIPSACYLKL 
NPSLINTSBS 



Seq ID NO: 676 DNA sequence

Nucleic Acid Accession #: NM_006853.1

Coding sequence: 26..874



A6GAATCTGC 
ATCGGGCA6A 
CATGAGGATT 
CA6GATCATC 
CGAGAAGACG 
AGCCCACTGC 
GGAGGGCTGT 
CAGCCTCCCC 
CTCCATCACC 
CAGCTGCCTC 
CTTQOGATGC 
CAACATCACA 
GGGTGACTCC 
CCAG6ATCG0 
G6ACTGGATC 
ACCCTCCATT 
CAA6ACCCTC 
AATCAACCT6 
GACTCTGGGA 
TCCTGGCCAT 



11 

I 

GCTCGGGTTC 
GGTCrCACAG 
CrGCAGTTAA 
AAGGGGTTCG 
CGGCTACTCT 
CTCAAOCCCC 
6AGCAGACCC 
AACAAAGACC 
T6G6CTGTGC 
ATTTCOGGCT 
GCCAACATCA 
GACACCATG6 
6GGG6CCCTC 
TQTGOGATCA 
CAGGAGACGA 
TCCACTTGCrr 
TAC6AACATT 
GGSTTOGAAA 
ATGACAACAC 
ATATCAAGGT 



21 
I 

CGCAGATGCA 
CAGCCAA6GA 
TCCTGCTTGC 
A6TGCAA6CC 
GTGGGGCGAC 
GCTACATAGT 
GQACAGCCAC 
AG06CAATGA 
GAGCCCTCAC 
GGGGCA6CAC 
CCATCATTGA 
TGTGTGCCA6 
TOGTCTGTAA 
CCC8AAAG0C 
TGAAGAACAA 
GTTTGGTrCC 
CTTTGQGCCT 
TCAGTGAGAC 
CTGGTTTGTT 
TTCAATAAAT 



31 

I 

GAGGTTGAGG 
ACCTGGGGGC 
TCTOGCAACA 
TCACTCGCAG 
GCTCATCGCC 
TCACCTGGGG 
TGA6TCCTTC 
CATCATGCTG 
CCTCTCCTCA 
GTCCAGCCCC 
GCACCAGAAG 
OGTGCAGGAA 
CCAGTCTCTT 
T6GTGTCTAC 
TTAGACTGGA 
T6TTCACTCT 
CCTGGACTAC 
CTGGATTCAA 
CTCTGTTGTA 
ATTTGCTAAA 



41 

1 

TGGCTGCGGO 
CGCTCCTCCC 
GGGCTTQTAjS 
CCCTGGCAGG 
CCCAGATGGC 
CAGCACAACC 
CCCCACCCCG 
GTGAAGAT6G 
GGCTQTGTCA 
CAGTTACGCC 
TGTGAGAACG 
GGGGGCAAGG 
CAAGGCATTA 
A06AAAGTCT 
CCCACCCACC 
GTTAATAAGA 
AGGAGATGCT 
ATTCTGCCTT 
TCCCCAfiCCC 
TGAGTG 



60 
120 
180
240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900
960
1020 
1080 



60 
120 
180 

240 
300 
360 



51 
I 

AGrGQAAGTC 
CCCTCCAGGC 
GGOSAGAGAC 
CA6CCCTGTT 
TCCTGACAGC 
TCCAGAAGGA 
GCTTCAACAA 
CATCGCCA6T 
CTGCTG6CAC 
TGCCTCACAC 
CCTACCCCGG 
ACTCCTGCCA 
TCTCCTGGGG 
GCAAATATGT 
ACAGCCCATC 
AACCCTAAGC 
GTCACTTAAT 
GAAATATTGT 
CAAAGACA6C 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 



Seq ID NO: 677 Protein sequence 
Protein Accession #: NP_006844.1 

1 11 21 31 41 51

MRILQLILLA LATGLVGGET RIIK6FECKP HSQPWQAALF EKTRLUX3AT LIAPRWLLTA 




AHCUCPRYIV HLGQHNLQKE EGCEQTRTAT BSPPHPGFNN SLPNKDHBND IMLVKMASPV 120 

MTWAVRPLT LSSRCVTAGT SCLISGWQST SSPQLRLPHT LRCANITIIE HQKCENAYPQ 180 

NITOTHVCAS VQEGGKDSCQ GDS6GPLVCH QSLQGIISWO QDPCAITRKP GVYTKVCKYV 240 
DWIQETMKMH 

Seq ID NO: 678 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..933

1 11 21 31 41 51 

ItOTQCAGCA LoGACGGTC CATCCCGC3GC OCCTGGCAGT GTGACGQGCT GCCTGACTOC 60 

TTCGACAAGA GTGATGAQAA GGAGTGCCOC AAGC3CTAAGT CGAAATGTGG CCCX5ACCTTC 120 

TTCCCCTOTG CX3W3COGCAT CCATTGCATC ATTGOTCGCT TCCJQQTGCAA TGGGTTTQAG 180

GACTGTCCCG ATGGCAGCGA TGAAGAGAAC TGCACAGCAA ACCCTCTGCT TTGCTCCACC 240 

GCCCGCTACC ACTGCAAGAA CGGCCTCTGT ATTGACAAGA GCTTCATCTG CQATGGACAG 300 

AATAACTGTC AAQACAACAG TGAIGAGQAA AGCTGT6AAA QTTCTCAAGA ACCOGG^ 360 

GGGCAGGTGT TTGTGACTTC AGAGAACCAA CTTGTGTATT ACCCCAGCAT CACCTATQCC 420 

ATCATCQ6CA GCTCOGTCAT TTTTGTGCT6 GTG6TGGCCC TGCTGGCACT GGTCTTGCAC 480 

CACCAGCGGA AGOGOAACaA CCTCATOACG CTOCCCJGTaC ACCGGCTGCA GCACCCTGTG 540 

CTGCTGTCCC GCCTGGTGGT CCTGGACCAC CCCCACCACT QCAACGTCAC CTACAACOTC 600 

AATAATGGCA TCCAGTATGT GGCCAGCCAG GCGGAGCAGA ATGCGTCGGA AGTAOQCTCC 660 

CCACCCTCCT ACTCCGAGGC CTTGCTGGAC CAGAGGCCTQ CGTGGTATGA CCTTCCTCCA 720 

COOCCCTACT CTTCTQACAC OOAATCTCTG AACCAAGCCG ACCTGCCCCC CTACCGCTCC 780 

0GGTCC3GGSA GTGCXaACAQ TOOCftCCTCC CAGGCAGCCA GCAGCCTCCT QAGCGTGGAA 840 

GACACCRGCC ACAGCXXXSOQ GCAGCCTGGC CCCCAGGAGG GCACTGCTGA GCCCAGGGAC 900 
TCTGAOCCGA GCCAGGGCAC TGAAGAAGTA TAA 

Seq ID NO: 679 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

icSNORCIPO AWQCDGLBDC PDKSDBKECP KAKSKCGPTP PPCASGIHCI IGRFRCNGPE 60 

DCPDGSDBEM CTAHPLLCST ARYHCKMGLC IDKSPICDGQ NNCQDNSDEE SCESSQEPGS 120 

GQVFVTSENQ LVYYPSITYA IIGSSVIFVL WALIAbVLH HQRKRNNLMT I^VmLQHPV 180 

LLSRIiWLDH PHHCNVTYNV NNGIQYVASQ ABQMASEVOS PPSYSEAliD ^MWDLPP 240 

PPYSSDTESL NQADLPPYRS RSGSANSASS QAASSLLSVB DT8HSPGQPG PQEGTABPRD 300 
SEPSQGTEBV 

Seq ID NO: 680 DNA sequence 
Nucleic Acid Accession #: 578203.1 
Coding sequence: 1..2190

1 11 21 31 41 51 

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCAT TGRA 
GAGGTAOCAC CTCGACCACC TAGCCXTCCA AAGAAGCCAT CTCOGACAAT CTGTOCTCC 

AACTATOCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCXSAGOS CTTTTCCTAT 180 

TOT^AT» AAGCTCTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATCAAGAT 240 

ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCT6TT ATTTTACTCC CATCCTGGGA 300 

GCAGCCATTG CTGACTCGTG OTTGOGAAAA TTCAA6ACAA TCATCTATCT CTCCTTGGTG 360 

TATOTGCTTG GCCATGTQAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGT6 420 

6TACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATQC AGAGGAACGG 540 

ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATG CAG GGAGCTTGAT TTCTACATTT 600 

ATCACACCCA TGCTOAGAGG AGATCfTOCAA TQTTTTGGAO AAQACTG CTA TGCATTGGCT 660 

TTTGGAGTTC CAGGACTQCT CATOQTAATT OCACTTCTTG TOTTTOCAAT GGGAAGCAAA 720 

ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTOGCTC AAOTTTTCAA ATGTATCTGQ 780 

TTTGCTArrr CCAATCQTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAQCACTOa 840 

CTAGACTGGG CAGCTGAGAA ATATCCAAAO CAGCTCATTA TGGATGTAAA GGCACTGACC 900 

AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGQO CTCTTTTGGA TCAGCAGGGT 960 

CTTtScA^ CATCAGGATO 1020

CCGGACCAQA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC .GTTGTT^^ 1080 

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGQAAAATC 1140 

GCTGITGGTA T6ATCCTAGC GT6CCTGGCA TTTGCAQTTO CGGCAGCTGT AGAQMAAAA 1200 

ATAAATOAAA TCQCCXCAGC CCAOTCAGGT OCXXSGGAGG TTTTCCTACA AGTCTTGAAT 1260 

CTGGCRGATO ATGAGGTGAA GGTGACAGTO GTGGOAAATG AAW^ 1320

GAGTCCATCA AATCCITTCA QAAAACACCA CACTATTCCA AACTGCACCT 1380 

AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GTGCAGGAGA AGAACTGGTA avGTCTTGTC ATTCCnX^ 1500

ATQATGOTAA AGGATACAGA AAOCaAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560 

SJSctS^ ATMAGMOT OACATCTCC CTGAGTACAG ATACCTCrCT CAATGTTGGT 1620 

GAAQACTATG GTCTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGWOTGCAC 1680 

TGTAGAACAG AAOATAAQAA CTTTTCTCTG AATTTGG6TC TOTI^^ 1740

TATCTGTTTC TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTG6AA GATTOAAGAC 1800 

ATTCCAGCCA ACaAAATGTC CATTGCOTGG CAGCTACCAC AATATGCCCT GQTTACAGCT 1860

SS^SS^ ?OTCTCTGT ScAGGTCTT GAGTTTTCTT ATT^ 1920

AtSaW^ TGCTCCAGGC AGCTTGGCTA TTXSAOUVTTG CA^^^ 1980 

CTTGTTGTGG CACRGITCAG TGGCCTGGTA CAGTGGGCCX5 AATTCATTTT STTTTCCTGC 2040 

CTCCTGCTGG TGATCTGCCT QATCTTCTCC ATCATOGGCT ACTACTATOT TCCTGTAAAG 2100 

SaGMQATA TQCGGGGTCC AGCAGATAAG CACATTCOT 2160
AAACTAQAOA CCAAGAAGAC AAAACTCTGA 

Seq ID NO: 681 Protein sequence 
Protein Accession #: AAB34388.1



60 
120 






1 

I 

MHPFQKNESK 
YGMKAVblLY 
YVLGHVIKSL 
TRyFSVFYLS 
lYNKPPPEO? 
RVLFLYIPLP 
PVIYRLVSKC 
IiADDEVKVTV 
VQEKNWYSliV 
EDYGVSAYRT 
XFAKKMSIAH 
LWAQFS6LV 
KLETKRTKL 



11 
I 

ETLPSPVSIB 
FLYFLHWNED 
GAIiPILGGQV 
INAGSLISTF 
IVAQVFKCIW 
MFWALLDQQQ 
GINFSSI.RKM 
VOJENN8LLI 
IREDGNSXSS 
VQRGEYPAVH 
QLPQYALVTA 
QHAEFILFSC 



21 

I 

EVPPRPPSPP 
TSTSIYHAPS 
VHTVLSLIGL 
ITPMLRGDVQ 
FAISNRPKNR 
SRWTLQAIRM 

ESIKSFQKTP 
MMVKDTESKT 
CRTEDXKFSIi 
GEVMFSVTGL 
LLLVICLIFS 



31 
I 

KKPSPTIC6S 
SLCYFTPILG 
SLIALGTGGI 
CFGEDCYALA 
SGDZPKRQBN 
NRNLGFFVLQ 
PAVAAAVBIK 
HYSKLHUCTK 
TKGMTTVRFV 
NLGLLDFSftA 
EFSYSQAPSS 
IMSYYYVPVK 



41 
I 

NYPLSIAFIV 
AAIADSWIiGK 
KPCVAAFGGD 
FGVPGLLMVI 
LDMAAEKYPK 
PDQMQVLNPF 
INHMAPAQSG 
SQDPHPHLKY 
NTIiHKDVNIS 
YIiPVlTinmi 
MKSVZ^AAin. 
TEDMRGPADX 



51 
I 

VKEFCERPSY 
FKTIIYLSLV 
QPBEKHAEER 
ALWFAMGSX 
C2LIMDVKALT 
LVLIFlPIiFD 
PQEVFLQVLM 
HNLSLYTEHS 
LSTDTSLNVG 
QOLOAHKIED 
LTIAVGNIIV 
HZFKIQOIHZ 



Seq ID NO: 682 DNA sequence

Nucleic Acid Accession #: NM_016077.1

Coding sequence: 128..667

1          11         21         31         41         51
|          |          |          |          |          |
TCQCTTTGTG ATTCTTGATC GGGAACTTTG TCACCCAflGA ACCCCGGAAG AGGTAGCTCA 60
CGOGATAGAA ACGTGTTCGC TTGCCCAGAA 6AAGGGAAGG CGCGA6TGAG GAAAGGAGGT 120
ACT6TAGATG CCCTCCAAAT CCTTGGTTAT GGAATATTTG GCTCATCCCA 6TACACTCGG 180
CTTGGCTGTT G6AGTTGCTT GTGGCATGTG CCTGGGCTGG AGCCTTCGAG TATGCTTTGQ 240
GATOCTCCCC AAAAGCAAGA OGAGCAAGAC ACACACAGAT ACTGAAAGTG AAGCAAGCAT 300
CTTGGGAGAC AG0G6G6AGT ACAA6ATOAT TCTTGTGGTT CGAAATGACT TAAAGATGGG 360
AAAA6GGAAA GTGGCTGCCC AGTGCTCTCA TGCTGCTGTT TCAGCCTACa AGCAGATTCA 420
AAGAAGAAAT CCTGAAATGC TCAAACAATQ GGAATACTGT GGCCAGOCCA AGGTGGTGGT 480
CAAAGCTCCT GATGAAGAAA CCCTQATTGC ATTATTGGCC CATGCAAAAA TGCTGGGACT 540
QACTGTAAGT TTAATTCAAG ATGCTGGAOG TACTCAGATT GCACCAGGCT CTCAAACTGT 600
OCTAGGGATT GGGCGAGGAC CAGCAGACCT AATTGACAAA GTCACTGOTC ACCTAAAACT 660
TTACTAGGTG GACTTTGATA TGACAACAAC CCCTCCATCA CAAGTGTTTG AAGCCTGTCA 720
6ATTCTAACA ACAAAAGCTG AATTTCTTCA OCCAACTTAA ATOTTCTTGA OATOftAAATA 780
AAACCTATTC CCATGTTCTA AAAAAA



Seq ID NO: 683 Protein sequence
Protein Accession #: NP_057161.1

1 11 21 31 41 51 

1 I I I 1 i 

MPSKSLVNEY LAHPSTLGLA VGVACGMCLG W5LRVCPGML PKSKTSKTHT DTESEASILG 
DSGBYKMILV VRNDLKMGKO KVAAQCSHAA VSAYKQIQRR NPEMUCQWEY CXKJPKVWKA 
PDBBTLIAIiL AHAKMWHiTV SLIQDAGRTQ IAPGSQTVU3 lOPGPADIiID KSTTOHLKLY 

Seq ID NO: 684 DNA sequence

Nucleic Acid Accession #: NM_004864.1

Coding sequence: 26..952



CX3GAACGAGG 
TCAGATGCTC 
GGCCGAGGCG 
ATTC06ASAG 
CTGGGAAGAT 
AGTGCGGCTG 
GGGGCTCCCC 
AAGGT06TGG 
GCCC60GCTG 
ATCTTCGTCC 
CCGCAGAGCG 
TCTGCACAOG 
ACSGGGAGGTG 
CATGCACGCX3 
CTGCTGOGTG 
GTCGCTCCAG 
GGTCCTTCCaV 
GGGCTCAAOa 
TTATTTATTA 
ACTGTGTATT 



11 
I 

GCAACCTGCA 
CTGGTGTTGC 
AOCOGCGCftA 
TTG0G6AAAC 
TCGAACACCG 
GGATCCGGCG 
GAGGCCTCX:C 
GAC3GT6ACAC 
CACCTGCGAC 
GCACGGCCCC 
CGTGCGCGCA 
6TCCGCGCGT 
CAAQTQACCA 
CAGATCAAiQA 
CCCGCCAGCT 
ACCTATGATG 
CTGTGCACCT 
TTCCTGAGAC 
TTAATTTATT 
TATTTiUUUU: 



21 

I 

CAGCCATGCC 
TGGTGCTCTC 
GTTTCCCGGG 
GCTA0GA6QA 

GCCACCTGCA 
GCCTTCACCG 
QACCGCTGCG 
TGTCQCCGCC 
AGCTGGAGTT 
ACGGGGACGA 
CGCrGGAAGA 
TOTGCATCGG 
CGAGCCT6CA 
ACAATCCCAT 
ACTTGTTAGC 
GCGCGGGGGA 
ACCCGATTCC 
GGGGT6ACCT 
TCTGGTGATA 



31 

I 

CGGGCAAGAA 
GTGGCTGCCG 
ACCCICAGAS 
CCTGCTAACC 
GGCCXXTTGCA 
CCTGCX3TATC 
GGCTCTGTTC 
G06TCAGCTC 
GCCGTOQCAG 
6CACTTGCGG 
CTGTCCGCTC 
CCTGGGCTGG 
CGCGTGCCCG 
COGCCTGAAG 
GGTGCTCATT 
CAAAGACTGC 
GGCGACCTCA 
TGCCCAAACA 
TCTTGGGG^kC 
AAAATAAAGC 



41 

I 

CTCAGGACGG 
CATGGGGGC6 
TTGCACTCCG 
AGGCTQCGGG 
GTCCGGATAC 
TCTCGGGCCG 
OGGCTGTCCC 
AGCCTTGCAA 
TCGGACCAAC 
CCGCAAGCCG 
GGGCCCGGGC 
GCCGATTGGG 
AGCCAGTTCC 
CCCGACA06G 
CAAAAGACCQ 
CACTGCATAT 
GITGTCCTGC 
GCTGTATTTA 
TCGGGGGCTG 
TGTCT6AACT 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 



51 

I 

TGAATGGCTC 
CCCTGTCTCT 
AAGACTCCAG 
CCAACCAGAG 
TCACGCCAGA 
CCCTTCCCGA 
CGACGGCGTC 
GACCCCAAGC 
TGCTGGCAGA 
CCAGGGGGCG 
GTTGCTGC08 
TGCTGTCGCC 
GGGCGGCAAA 
AGCCAGGGCC 
ACACCGGGGT 
GAGCAGTCCT 
CCTGTGGAAT 
TATAAGTCTG 
GTCTGAT6GA 
GTTAAAAAAA 



Seq ID NO: 685 Protein sequence
Protein Accession #: NP_004855.1



11 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 



21 31 41 51 

I i I I 

MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SbAEASRASP PGPSELHSBD SRFHELRKRY 
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH LHUIISRAAL PEGLPEASRL 
RRALPRLSPT ASRSHDVTRP LRRQIiSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQIj 
ELBLRPQAAR GRRRARARMG DDCPLGPGRC CRLHTVRASL EDLGWft DWVL SPREVQVTNC 
IGACPSQFRA ANMHAQIKTS LHRI-KPDTEP APCCVPASYN PKVLIQKTDt GVSLQTYDDL 
LAKDCHCI 



60 
120 
180 
240 
300 



Seq ID NO: 686 DNA sequence
Nucleic Acid Accession #: NM_002423.2
Coding sequence: 48..851

1 11 21 31 41 51 

] I I i } I 

ACCSU^TCAA CCATAGGTCC AAGAACAATT GTCTCTGGAC G6CAGCTATQ CQACTCACCG 60 

TGCTGTGTGC TGTGTGCCTG CTGCCTGGCA GCCTGSCCCT GCOQCTGCCT CAQGAGGCGG 120 

GAGGCATGAG T6AGCTACAG TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTTATCTCT 180

ATGACTCAGA AACAAAAAAT GOCAACAOTT TAGAAGCCAA ACTCAAGGAO ATGCAAAAAT 240 

TCTTTGGCCT ACCTATAACT GGAATGTTAA ACTCCGGCX3T CATAGAAATA AT6CAGAAGC 300 

CCAGATGTGG AQTQCCAQAT GTTGCAGAAT ACTCACTATT TCCAAATAGC CCAAAATGGA 360 

CTTCCAAAGT GGTCACCTAC AGGATOGTAT CATATACTOG AGACTTACCX3 CATATTACAfl 420 

TG6A7CGATT AGT6TCAAA6 6CTTTAAACA TGT6606CAA AOAGATCCCC CTGCATTTCA 480 

OOAAAGTTGT ATG6GGAACT GCTGACATCA TQATTOGCrT TGOGGSAGOA QCTCATQGGG 540

ACTCCTACCC ATTTGATGGG CCAQQAAACA CQCTGGCTCA TGCCTTTGCG CCTGGGACAG 600 

GTCrCGGAGG AGATGCTCAC TTCGATGA6G ATGAACGCTG GAGGGATGGT AGCAGTCTAG 660 

GGATTAACTT CCT6TATGCT GCAACTCAT6 AACTTGGCCA TTCTTTGGGT ATGGGACATT 720 

CCTCTGATCC TAATGCAGT6 ATGTATCGAA CCTATGGAAA TQGAGATCCC CAAAATTTTA 780 

AACTTTCGCA aGATOATATT AAAOSGATTC AOAAACTATA TQSAAAGAGA AGTAATTCAA 840 

GAAAGAAATA GAAACTTCAO GCAOAACATC CATTCATTCA TTCATT6GAT TGTATATCAT 900 

TGTTOCACAA TCAGAATTOA TAAGCACTGT TCCTCCACTC CATTTAGCAA TTATGTCACC 960 

CTTTTTTATT GCAGTTGGTT TTTGAATGTC TTTCACTCCT TTTATTGGTT AAACTCCTTT 1020 

ATGGTGTGAC TGTGTCTTAT TCCATCTATG AGCTTTGTCA GTGCG0GTA6 ATGTCAATAA 1080 
ATGTTACATA CACAAATAAA TAAAATGTTT ATTCCATGGT AAATTTA 

Seq ID NO: 687 Protein sequence
Protein Accession #: NP_002414.1



1 11 21 31 41 51

I I I I I I

MRLTVLCAVC LLPGeiALPL PQEAGGMSBL QWEQAQDYLK RPYIiYDSETK MANSLEAKliK 60 
EMQKFFGLPI TGMLNSRVIE IMQKPRCGVP DVAEySl.FPN SPKWTSKWT YRIVSYTRDL 120 
PHITVDIOiVS KALNHWGKBX PLHFRKVWG TADIMXGFAR GAHGDSYPFD GPGNTLARAF 180 
AF6TSLQGDA HFDEDBaNTD GSSIiGINFLy AATHEUSHSL GKGHSSDFHA VMirPTTGNGD 240 
PC2NFKLS0DO IKOZQlOiYGK R6NSRKK



Seq ID NO: 688 DNA sequence

Nucleic Acid Accession #: NM_005221.3

Coding sequence: 1..870

1 11 21 31 41 51

I i i i I I 

ATGAGAG6AG TGTTT6ACA6 AAGG6TCCCC AGCATCCGAT CCGGOSACrr CCAAGCTCCG 60 

TTCCAGAOGT CCGCAGCTAT GCACCATC06 TCTCAGOAAT OGCCAACTTT GCCGGAGTCT 120 

TCAGCTAC08 ATTCTGACTA CTACAGCCCT ACGGGOGGAO CCCC6CAGGG CTACTGCTCT 180 

CCTACCTOGG CTTCCTATGG CAAAGCTCTC AACOCCTACC AGTATCAGTA TCACGGCX3TG 240 

AACGGCTCCG CCGGGAGCTA CCCAGCCAAA GCTTATGCCG ACTATAGCTA OGCTAGCTCC 300 

TACCACCA6T ACGGCGGCGC CTACAACCGC GTCCCAAGCG CCACCAACCA GCCAQAGAAA 360 

GAAGTGACCX3 AGCCCGAGGT GAGAATGGTG AATGGCAAAC CAAAGAAAGT TOGTAAACCC 420 

AGQACTATTT ATTCCAGCTT TCAGCTGGCC GCATTACAQA GAAGGTTTCA GAAGACTCAG 480 

TACCTCGCCT TGCCG6AACX3 CGCCGAGCTG GCCGCCTCGC TGGGATTGAC ACAAACACAG 540 

6TGAAAATCT GGTTTCAGAA CAAAAGATCC AAGATCAAGA AGATCATGAA AAACGG6GA0 600 

ATGCCCCCQG AGCACAGTCC CAGCTCCAGC GACCCAATGG GGTGTAACTC GCCGGAGTCT 660 

CCAGCGQTGT GGGAGCCCCA GGGCTCGTCC CGCTCGCTCA GCCACCACCC TCATGCCCAC 720 

CCTCC6ACCT CCAACCAGTC CCCAGCGTCC AGCTACCTGG ACA7VCTCTGC ATCCTGGTAC 780 

ACAA6TGCAG CCAGCTCAAT CAATTCCCAC CTGCCGCCGC OGGGCTCCTT ACAGCACCCO 840
CTC6CGCTQG CCTOJSQGAC ACTCTATTAG 



Seq ID NO: 689 Protein sequence
Protein Accession #: NP_005212.1

1 11 21 31 41 51 

I t ( 1 t I 

MTGVPDRRVP SIRSGDFQAP PQTSAAMHHP SQESPTLPBS SATDSDYYSP TGGAPHGYCS 60 
PTSASYQKAL NPyQYQYHGV NGSAGSYPAK AYADYSYASS YHQYGGAYNR VPSATWQPBK 120
EVTEPEVRNV NGXPXKVRRP RTZYSSFQXA ALQRRFQKTQ YLALPERAEIj AASLGLTQTQ 180 
VRIHFQNKRS KIKKIHKNQE MFPEHSPSSS DFMACNSPQS PAVWEPQGSS RSIiSHBpBAH 240 
PPTSNQSPAS SYLENSASWY TSAA8SXNSH LPPPGSLQHP LALASGTLY 
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It is understood that the examples described above in no way serve to limit the true 
scope of this invention, but rather are presented for illustrative purposes. All publications, 
sequences of accession numbers, and patent applications cited in this specification are herein 
incorporated by reference as if each individual publication or patent application were 
specifically and individually indicated to be incorporated by reference. 
1. A method of detecting a lung cancer-associated transcript in a cell from a patient, 
the method comprising contacting a biological sample from the patient with a 
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a 
sequence as shown in Tables 1A-16. 

2. The method of claim 1, wherein the polynucleotide selectively hybridizes to a 
sequence at least 95% identical to a sequence as shown in Tables 1A-16. 

3. The method of claim 1, wherein the biological sample is a tissue sample. 

4. The method of claim 1, wherein the biological sample comprises isolated nucleic 
acids. 

5. The method of claim 4, wherein the nucleic acids are mRNA. 

6. The method of claim 4, further comprising the step of amplifying nucleic acids 
before the step of contacting the biological sample with the polynucleotide. 

7. The method of claim 1, wherein the polynucleotide comprises a sequence as shown 
in Tables 1A-16. 

8. The method of claim 1, wherein the polynucleotide is labeled. 

9. The method of claim 8, wherein the label is a fluorescent label. 

10. The method of claim 1, wherein the polynucleotide is immobilized on a solid 
surface. 

11. The method of claim 1, wherein the patient is undergoing a therapeutic regimen to 
treat lung cancer. 

12. The method of claim 1, wherein the patient is suspected of having lung cancer. 

13. A method of monitoring the efficacy of a therapeutic treatment of lung cancer, the 
method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; 
and 
(ii) determining the level of a lung cancer-associated transcript in the biological 
sample by contacting the biological sample with a polynucleotide that selectively 
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, 
thereby monitoring the efficacy of the therapy. 

14. The method of claim 13, further comprising the step of: (iii) comparing the level 
of the lung cancer-associated transcript to a level of the lung cancer-associated 
transcript in a biological sample from the patient prior to, or earlier in, the 
therapeutic treatment. 

15. The method of claim 13, wherein the patient is a human. 

16. A method of monitoring the efficacy of a therapeutic treatment of lung cancer, the 
method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; 
and 
(ii) determining the level of a lung cancer-associated antibody in the biological 
sample by contacting the biological sample with a polypeptide encoded by a 
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a 
sequence as shown in Tables 1A-16, wherein the polypeptide specifically binds to the 
lung cancer-associated antibody, thereby monitoring the efficacy of the therapy. 

17. The method of claim 16, further comprising the step of: (iii) comparing the level 
of the lung cancer-associated antibody to a level of the lung cancer-associated antibody 
in a biological sample from the patient prior to, or earlier in, the therapeutic 
treatment. 

18. The method of claim 16, wherein the patient is a human. 

19. A method of monitoring the efficacy of a therapeutic treatment of lung cancer, the 
method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; 
and 
(ii) determining the level of a lung cancer-associated polypeptide in the biological 
sample by contacting the biological sample with an antibody, wherein the antibody 
specifically binds to a polypeptide encoded by a polynucleotide that selectively 
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, 
thereby monitoring the efficacy of the therapy. 

20. The method of claim 19, further comprising the step of: (iii) comparing the level 
of the lung cancer-associated polypeptide to a level of the lung cancer-associated 
polypeptide in a biological sample from the patient prior to, or earlier in, the 
therapeutic treatment. 

21. The method of claim 19, wherein the patient is a human. 

22. An isolated nucleic acid molecule consisting of a polynucleotide 
sequence as shown in Tables 1A-16. 

23. The nucleic acid molecule of claim 22, which is labeled. 

24. The nucleic acid of claim 23, wherein the label is a fluorescent label. 

25. An expression vector comprising the nucleic acid of claim 22. 

26. A host cell comprising the expression vector of claim 25. 

27. An isolated polypeptide which is encoded by a nucleic acid molecule 
having a polynucleotide sequence as shown in Tables 1A-16. 

28. An antibody that specifically binds a polypeptide of claim 27. 

29. The antibody of claim 28, further conjugated to an effector component. 

30. The antibody of claim 29, wherein the effector component is a 
fluorescent label. 

31. The antibody of claim 29, wherein the effector component is a 
radioisotope or a cytotoxic chemical. 

32. The antibody of claim 29, which is an antibody fragment. 




33. The antibody of claim 29, which is a humanized antibody. 

34. A method of detecting a lung cancer cell in a biological sample from a patient, the 
method comprising contacting the biological sample with an antibody of claim 28. 

35. The method of claim 34, wherein the antibody is further conjugated to an effector 
component. 

36. The method of claim 35, wherein the effector component is a fluorescent label. 

37. A method of detecting antibodies specific to lung cancer in a patient, the method 
comprising contacting a biological sample from the patient with a polypeptide encoded 
by a nucleic acid that comprises a sequence from Tables 1A-16. 

38. A method for identifying a compound that modulates a lung cancer-associated 
polypeptide, the method comprising the steps of: 
(i) contacting the compound with a lung cancer-associated polypeptide, the polypeptide 
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% 
identical to a sequence as shown in Tables 1A-16; and 
(ii) determining the functional effect of the compound upon the polypeptide. 

39. The method of claim 38, wherein the functional effect is a physical effect. 

40. The method of claim 38, wherein the functional effect is a chemical effect. 

41. The method of claim 38, wherein the polypeptide is expressed in a eukaryotic host 
cell or cell membrane. 

42. The method of claim 38, wherein the functional effect is determined by measuring 
ligand binding to the polypeptide. 

43. The method of claim 38, wherein the polypeptide is recombinant. 
44. A method of inhibiting proliferation of a lung cancer-associated cell to treat lung 
cancer in a patient, the method comprising the step of administering to the subject a 
therapeutically effective amount of a compound identified using the method of claim 38. 

45. The method of claim 44, wherein the compound is an antibody. 

46. The method of claim 45, wherein the patient is a human. 

47. A drug screening assay comprising the steps of: 
(i) administering a test compound to a mammal having lung cancer or a cell isolated 
therefrom; 
(ii) comparing the level of gene expression of a polynucleotide that selectively 
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 
in a treated cell or mammal with the level of gene expression of the polynucleotide in a 
control cell or mammal, wherein a test compound that modulates the level of expression 
of the polynucleotide is a candidate for the treatment of lung cancer. 

48. The assay of claim 47, wherein the control is a mammal with lung cancer or a cell 
therefrom that has not been treated with the test compound. 

49. The assay of claim 47, wherein the control is a normal cell or mammal. 

50. A method for treating a mammal having lung cancer comprising administering a 
compound identified by the assay of claim 47. 

51. A pharmaceutical composition for treating a mammal having lung cancer, the 
composition comprising a compound identified by the assay of claim 47 and a 
physiologically acceptable excipient. 
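
By way of illustration only, the "at least 80% identical" and "at least 95% identical" thresholds recited in the claims can be sketched as a simple calculation over two sequences that have already been aligned to equal length. The function names, the use of "-" as the gap character, and the choice to count a gap opposite a residue as a mismatch are assumptions made for this sketch; they are not the identity definition or alignment procedure set out in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences (hypothetical helper).

    Assumes both strings come from an existing alignment, so they have the
    same length and use '-' for gaps. Columns where both sequences have a
    gap are skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-nucleotide example with a single mismatch at the last position.
    reference = "ATGGGCTACC"
    candidate = "ATGGGCTACA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 80.0))
    print(meets_identity_threshold(reference, candidate, 95.0))
```

In this toy case the pair would satisfy an 80% threshold but not a 95% threshold. A real comparison would first require an alignment step, which this sketch deliberately leaves out.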
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